Re: [PATCH] tcp: do not promote SPLICE_F_NONBLOCK to socket O_NONBLOCK

From: Evgeniy Polyakov
Date: Thu Jul 17 2008 - 10:21:57 EST


Hi Octavian.

On Thu, Jul 17, 2008 at 04:33:49PM +0300, Octavian Purdila (opurdila@xxxxxxxxxxx) wrote:
> This patch changes tcp_splice_read to the behavior implied by man 2
> splice:
>
> SPLICE_F_NONBLOCK - Do not block on I/O. This makes the splice
> pipe operations non-blocking, but splice() may nevertheless block
> because the file descriptors that are spliced to/from may block
> (unless they have the O_NONBLOCK flag set).
>
> This approach also provides a simple solution to the splice
> transfer size problem. Say we have the following common sequence:
>
> splice(socket, pipe);
> splice(pipe, file);
>
> Unless we specify SPLICE_F_NONBLOCK, we can't use arbitrarily large
> transfer sizes with the 1st splice since otherwise we will deadlock
> due to pipe being full. But if we use SPLICE_F_NONBLOCK, the current
> implementation will make the underlying socket non-blocking and thus
> will force us to use poll or another async I/O notification mechanism.

The existing behaviour was chosen so that splice can make progress when
the socket does not have enough data to fill the pipe. With your change,
if the socket was not opened in non-blocking mode, the read will block no
matter whether SPLICE_F_NONBLOCK is set or not. This is quite a serious
break of the overall idea behind SPLICE_F_NONBLOCK.

The socket is not marked as non-blocking when SPLICE_F_NONBLOCK is
specified; only the splice itself uses non-blocking reads, and any read
via recv() still uses the existing socket flags.

--
Evgeniy Polyakov