Re: Data corruption issue with splice() on 2.6.27.10

From: Jens Axboe
Date: Wed Jan 07 2009 - 03:18:28 EST


On Wed, Jan 07 2009, Herbert Xu wrote:
> On Tue, Jan 06, 2009 at 06:37:05PM +0000, Jens Axboe wrote:
> >
> > I'll give this a spin tomorrow as well. A hunch tells me that this is
> > likely a page reuse issue, that splice is getting the reference to the
> > buffer dropped before the data has really been transmitted. IOW, the
> > page is likely fine reaching the ->sendpage() bit, but will be reused
> > before the data has actually been transmitted. So once you get that far,
> > other random data from that page is going out.
>
> I see the problem.
>
> The socket pipes in net/core/skbuff.c use references on the skb
> to hold down the memory in skb->head as well as the pages in the
> skb.
>
> Unfortunately, once the pipe is fed into sendpage we only use
> page reference counting to pin down the memory. So as soon as
> sendpage returns we drop the ref count on the skb, thus freeing
> the memory in skb->head, which is yet to be transmitted.
>
> Moral: Using page reference counts on skb->head is wrong.

So my hunch was pretty close. The fix would seem to involve NOT calling
ops->release(pipe, buf) until we actually have an ACK on that data gone
out.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/