sendfile and EAGAIN

From: Ulrich Drepper
Date: Mon Feb 25 2013 - 12:23:03 EST


When sendfile is used with a non-blocking socket as the output file
descriptor, the operation can end in a partial write when the socket's
output queue fills up. This is nothing critical in itself: the
operation could resume once the output queue has drained. The problem
is that there is no way to determine where to resume.

The system call just returns -EAGAIN without any further indication.
The caller doesn't know what to resend.

And this even though the interface of sendfile would be capable of
communicating this information and the man page (I know, it's not
authoritative) describes this behavior as well.
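
For reference, the calling pattern in question looks roughly like the
sketch below. The helper name send_file_nonblocking and its return
convention are made up for illustration. The retry after EAGAIN is
only correct under the assumption that *offset reflects every byte
already queued to the socket, which is exactly the property being
questioned here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send `count` bytes of in_fd, starting at *offset,
 * to the non-blocking socket out_sock. Returns 0 on success, -1 on error.
 */
int send_file_nonblocking(int out_sock, int in_fd, off_t *offset, size_t count)
{
	assert(offset != NULL);
	off_t end = *offset + (off_t)count;

	while (*offset < end) {
		ssize_t n = sendfile(out_sock, in_fd, offset,
				     (size_t)(end - *offset));
		if (n > 0)
			continue;	/* sendfile advanced *offset by n */
		if (n == 0)
			return -1;	/* input file shorter than expected */
		if (errno == EINTR)
			continue;
		if (errno != EAGAIN)
			return -1;

		/*
		 * Output queue is full: wait until the socket is writable.
		 * Resuming from *offset assumes it accounts for all bytes
		 * that were already sent before EAGAIN was reported.
		 */
		struct pollfd pfd = { .fd = out_sock, .events = POLLOUT };
		if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
			return -1;
	}
	return 0;
}
```

If *offset is left untouched while some bytes did go out, this loop
sends those bytes a second time and the receiver sees a corrupted
stream.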

The problem is probably in a few places, here is one (fs/splice.c):

static ssize_t default_file_splice_write(struct pipe_inode_info *pipe,
					 struct file *out, loff_t *ppos,
					 size_t len, unsigned int flags)
{
	ssize_t ret;

	ret = splice_from_pipe(pipe, out, ppos, len, flags, write_pipe_buf);
	if (ret > 0)
		*ppos += ret;

	return ret;
}

Note that *ppos is only updated if the call doesn't fail. We could
also update the position if ret == -EAGAIN. This would require
re-architecting the code a bit, either to update *ppos in
splice_from_pipe etc. or to communicate the number of bytes written
back from the splice_from_pipe call. In any case, the result would
be that the caller knows where to resume the operation.
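
To make the second variant concrete, here is a small userspace model
of the proposed semantics. It is not kernel code and not a patch: the
model_* names, the struct and the extra `done` out-parameter are all
invented for illustration, and the stand-in for splice_from_pipe
simply pretends the output has a fixed amount of free space.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a socket output queue with limited room. */
struct model_sink {
	size_t free_bytes;
};

/*
 * Hypothetical stand-in for splice_from_pipe(): consumes as much of
 * `len` as fits, reports the consumed byte count through *done, and
 * returns -EAGAIN if not everything fit.
 */
static ssize_t model_splice_from_pipe(struct model_sink *out, size_t len,
				      size_t *done)
{
	size_t n = len < out->free_bytes ? len : out->free_bytes;

	out->free_bytes -= n;
	*done = n;
	return n == len ? (ssize_t)n : -EAGAIN;
}

/* Model of the write path with the position also updated on -EAGAIN. */
ssize_t model_file_splice_write(struct model_sink *out, off_t *ppos,
				size_t len)
{
	size_t done = 0;
	ssize_t ret = model_splice_from_pipe(out, len, &done);

	if (ret > 0 || ret == -EAGAIN)
		*ppos += (off_t)done;	/* record partial progress too */

	return ret;
}
```

With this behavior a caller that sees -EAGAIN can look at the updated
position and continue from there once the output is writable again.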

I would argue that this doesn't break the ABI. Existing programs that
simply resend the data from the beginning after -EAGAIN have already
sent an unknown number of bytes in the previous sendfile() call, so
the state of the communication is unpredictable either way.

Opinions? I think that, as is, sendfile() isn't useful with O_NONBLOCK.