Re: [PATCH] pipe: don't block after data has been written

From: Eric Dumazet
Date: Thu Nov 05 2009 - 11:28:25 EST


Max Kellermann wrote:
> According to the select() / poll() documentation, a write operation on
> a file descriptor which is "ready for writing" must not block. Linux
> violates this rule: if you pass a very large buffer to write(), the
> system call will not return until everything is written, or an error
> occurs.
>
> This patch adds a simple check: if at least one byte has already been
> written, break from the loop, instead of calling pipe_wait().
>
> Signed-off-by: Max Kellermann <max@xxxxxxxxxxx>
> ---
>
> fs/pipe.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/pipe.c b/fs/pipe.c
> index ae17d02..9d84f0b 100644
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -582,7 +582,7 @@ redo2:
>  		}
>  		if (bufs < PIPE_BUFFERS)
>  			continue;
> -		if (filp->f_flags & O_NONBLOCK) {
> +		if (filp->f_flags & O_NONBLOCK || ret > 0) {
>  			if (!ret)
>  				ret = -EAGAIN;
>  			break;
>

Then the select()/poll() documentation is wrong; please correct the documentation instead.

http://www.opengroup.org/onlinepubs/000095399/functions/write.html

ssize_t write(int fildes, const void *buf, size_t nbyte);

If the O_NONBLOCK flag is clear, a write request may cause the thread to block,
but on normal completion it shall return nbyte.

Every Unix I know behaves the same when writing to a pipe.
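
To make the quoted rule concrete, here is a minimal sketch of a check for it,
assuming a POSIX system and no signal handlers installed. The helper name
blocking_pipe_write() is made up for illustration. It issues one blocking
write() of nbyte bytes into a pipe while a child process drains the other
end, and returns whatever that single write() call reported:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): do one blocking write() of
 * nbyte bytes into a pipe that a child process is draining, and return
 * the value that write() call returned.
 */
ssize_t blocking_pipe_write(size_t nbyte)
{
	int fds[2];
	char *buf;
	ssize_t ret;
	pid_t pid;

	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: plays the role of the reader, e.g. "more" */
		char tmp[4096];

		close(fds[1]);
		while (read(fds[0], tmp, sizeof(tmp)) > 0)
			;
		_exit(0);
	}

	/* parent: O_NONBLOCK is clear on a freshly created pipe */
	close(fds[0]);
	buf = malloc(nbyte);
	if (!buf) {
		close(fds[1]);
		waitpid(pid, NULL, 0);
		return -1;
	}
	memset(buf, 'x', nbyte);

	ret = write(fds[1], buf, nbyte);

	free(buf);
	close(fds[1]);
	waitpid(pid, NULL, 0);
	return ret;
}
```

With the current behaviour, blocking_pipe_write(1000000) should return
1000000. With the patch applied I would expect a short count instead.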


Your patch breaks many programs that don't use poll()/select():

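#include <unistd.h>

char result[1000000];
void computethings(void);	/* fills result[]; defined elsewhere */

int main(void)
{
	computethings();
	/* relies on a single blocking write() delivering everything */
	write(1, result, sizeof(result));
	return 0;
}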


$ ./program | more

Please learn how useful O_NDELAY can be in a poll()/select() environment.
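
For reference, a rough sketch of that approach, not a definitive
implementation: set O_NONBLOCK (O_NDELAY is the older name for it on Linux)
on the descriptor, let poll() do the waiting, and handle short writes and
EAGAIN in the caller. The helper name write_all_nonblock() is made up for
illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to fd without ever sleeping
 * inside write(). All waiting happens in poll().
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t write_all_nonblock(int fd, const char *buf, size_t len)
{
	size_t done = 0;
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -1;

	while (done < len) {
		struct pollfd pfd = { .fd = fd, .events = POLLOUT };
		ssize_t n;

		/* sleep here until the pipe has room */
		if (poll(&pfd, 1, -1) < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}

		/* may be a short write, or EAGAIN if the room is gone */
		n = write(fd, buf + done, len - done);
		if (n < 0) {
			if (errno == EAGAIN || errno == EINTR)
				continue;
			return -1;
		}
		done += n;
	}
	return (ssize_t)done;
}
```

A real event loop would poll several descriptors at once and keep the
"done" offset per connection; the loop above only shows the write side.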

Thanks