write() on pipe blocking due to in-page fragmentation?

From: Ricardo Nabinger Sanchez
Date: Fri Sep 23 2011 - 15:52:27 EST


Hello,

The simple program attached creates a pipe, performs a number of
writes to fill it, and then reads that data back to empty the pipe.
The command-line argument sets how many bytes are written per
write() call.

Values that are powers of 2, up to PIPE_BUF, work without any issues.
Other values may cause the write() call to block.

For example, on a 32-bit machine:

(gdb) r 3
Starting program: /home/rnsanchez/Taghos/sshfs/pipe-test/single32 3
Pipe: 00000/32
write: 65520..65523/65536^C
Program received signal SIGINT, Interrupt.
0xb7f2de5e in __write_nocancel () from /lib/libc.so.6
(gdb) bt
#0 0xb7f2de5e in __write_nocancel () from /lib/libc.so.6
#1 0x08048705 in main ()

I see similar behavior on a 64-bit machine running different versions
of both the Linux kernel and glibc.

Intuitively, it seems that pages in the pipe are getting fragmented:
a write that does not fit in the space left in the current page appears
to start a new page, leaving the tail of the previous one unused.
Eventually the pipe reaches its limit of 16 pages and, if the data is
not consumed, writers block, even though the data would otherwise fit.
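
To make that intuition concrete, here is a rough user-space model of it.
It is only a sketch, not kernel code: the 4096-byte page size, the
16-slot limit, the "merge only if the whole write fits in the current
page" rule, and the name model_pipe_capacity() are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for this model: 4096-byte pages, 16 page slots per pipe. */
#define MODEL_PAGE_SIZE  4096u
#define MODEL_PIPE_SLOTS 16u

/*
 * Hypothetical model: returns how many bytes a pipe accepts, in writes of
 * wlen bytes each, before the next write would block. A write is appended
 * to the most recent page only if it fits there entirely; otherwise it
 * takes a fresh slot and the unused tail of the previous page is lost.
 */
size_t model_pipe_capacity(size_t wlen)
{
    size_t slots_used = 0;  /* pages currently holding data */
    size_t last_fill = 0;   /* bytes used in the most recent page */
    size_t total = 0;       /* bytes accepted so far */

    assert(wlen > 0 && wlen <= MODEL_PAGE_SIZE);

    for (;;) {
        if (slots_used > 0 && last_fill + wlen <= MODEL_PAGE_SIZE) {
            last_fill += wlen;      /* merge into the current page */
        } else if (slots_used < MODEL_PIPE_SLOTS) {
            slots_used++;           /* start a new page */
            last_fill = wlen;
        } else {
            return total;           /* no slot left: writer would block */
        }
        total += wlen;
    }
}
```

Under these assumptions, model_pipe_capacity(3) yields 65520 (16 pages
of 4095 usable bytes each), which is the offset at which the write in
the gdb session above stalls, while sizes that divide the page size
evenly reach the full 65536.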

Is this understanding correct? If so, is it something that should be
fixed in the Linux kernel?

Or should the application take care to write data to the pipe in a way
that does not block the writer?
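
One way an application might do that is to put the write end in
non-blocking mode, so that a full pipe shows up as EAGAIN instead of a
stalled write(). The sketch below is illustrative only: the helper name
fill_pipe_nonblocking() and its parameters are hypothetical, and it
assumes the caller is happy to stop writing once the pipe refuses data.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: switches wfd to non-blocking mode and writes
 * wlen-byte chunks from buf until the pipe refuses more data or limit
 * bytes have been written. Returns the bytes accepted, or -1 on error.
 */
long fill_pipe_nonblocking(int wfd, const char *buf, size_t wlen, long limit)
{
    long total = 0;
    int flags = fcntl(wfd, F_GETFL);

    if (flags == -1 || fcntl(wfd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    while (total + (long)wlen <= limit) {
        ssize_t n = write(wfd, buf, wlen);

        if (n > 0) {
            total += n;
        } else if (n == -1 && errno == EINTR) {
            continue;   /* interrupted by a signal: retry */
        } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            break;      /* pipe is full: a blocking write would stall here */
        } else {
            return -1;  /* unexpected error */
        }
    }
    return total;
}
```

The return value tells the caller how much the pipe actually took, so
it can wait for the reader (for example with poll()) before retrying
rather than blocking inside write().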

Thank you in advance for your attention.

Regards

--
Ricardo Nabinger Sanchez http://www.taghos.com.br/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>

#define PIPES 1
#define BUFLEN 65536

struct pp {
    int pfd[2];
};

int main(int argc, char *argv[]) {
    unsigned int i;
    unsigned int len;
    int wlen;
    struct pp *pvet = calloc(PIPES, sizeof(struct pp));
    char *buf = malloc(BUFLEN);

    if (pvet == NULL || buf == NULL) {
        perror("alloc");
        abort();
    }

    // The single argument is the number of bytes per write() call.
    if (argc != 2)
        exit(1);

    wlen = atoi(argv[1]);
    if (wlen <= 0 || wlen > BUFLEN)
        exit(2);

    for (i = 0; i < BUFLEN; i++)
        buf[i] = 'o';

    // Create.
    for (i = 0; i < PIPES; i++) {
        if (pipe(pvet[i].pfd) != 0) {
            perror("pipe");
            abort();
        }
    }

    // Fill pipes.
    for (i = 0; i < PIPES; i++) {
        fprintf(stderr, "Pipe: %05u/%d\n", i, PIPES);
        for (len = 0; len < BUFLEN; len += wlen) {
            // Clamp the last chunk so we never read past the end of buf.
            unsigned int chunk = BUFLEN - len < (unsigned)wlen ?
                BUFLEN - len : (unsigned)wlen;

            fprintf(stderr, "\r write: %05u..%u/%d", len,
                len + chunk, BUFLEN);
            // This is the call that blocks for some values of wlen.
            if (write(pvet[i].pfd[1], &buf[len], chunk) <= 0) {
                perror("write");
                abort();
            }
        }
        fprintf(stderr, "\n");
    }

    // Consume data from pipes.
    for (i = 0; i < PIPES; i++) {
        fprintf(stderr, "Pipe: %05u/%d\n", i, PIPES);
        // A short read is treated as a failure here too.
        if (read(pvet[i].pfd[0], buf, BUFLEN) != BUFLEN) {
            perror("read");
            abort();
        }
    }

    exit(0);
}