Re: kernel panic in skb_copy_bits

From: Eric Dumazet
Date: Thu Jul 04 2013 - 05:34:33 EST

In reply to: Ian Campbell: "Re: kernel panic in skb_copy_bits"
Next in thread: Ian Campbell: "Re: kernel panic in skb_copy_bits"

On Thu, 2013-07-04 at 09:59 +0100, Ian Campbell wrote:
> On Thu, 2013-07-04 at 16:55 +0800, Joe Jin wrote:
> >
> > Another way is to add a new page flag like PG_send: when sendpage() is
> > called, set the bit; when the page is put, clear the bit. Then xen-blkback
> > can wait on the pagequeue.
>
> These schemes don't work when you have multiple simultaneous I/Os
> referencing the same underlying page.

So this is a page property, yet the patches I saw tried to address
this problem by adding networking stuff (destructors) to the skbs.

Given that a page refcount can be transferred between entities, say using
the splice() system call, I do not really understand why the fix would
involve networking only.

Let's try to fix it properly, or else we must disable zero copies
because they are not reliable.

Why doesn't sendfile() have the problem, while vmsplice()+splice() does
have this issue?

As soon as a page fragment reference is taken somewhere, the only way to
properly reuse the page is to rely on put_page() and the page being freed.
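
The refcount rule above can be illustrated with a small userspace sketch.
This is a toy model, not kernel code: struct toy_page and the toy_*()
helpers are hypothetical names invented here, and they only mimic the idea
behind get_page()/put_page(). The point it tries to show is that with
several holders of the same page, only the final put makes reuse safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a page; this is NOT the kernel's struct page. */
struct toy_page {
    int refcount;   /* number of holders that may still read the data */
    bool freed;     /* set once the last reference has been dropped */
};

/* Take a reference, as any fragment holder (pipe, skb, ...) would. */
static void toy_get_page(struct toy_page *p)
{
    assert(!p->freed);
    p->refcount++;
}

/* Drop a reference; the page is freed only when the last holder lets go. */
static void toy_put_page(struct toy_page *p)
{
    assert(p->refcount > 0);
    if (--p->refcount == 0)
        p->freed = true;
}

/* The page may be recycled only once nobody references it any more. */
static bool toy_page_reusable(const struct toy_page *p)
{
    return p->freed;
}
```

Starting from a page with refcount 1, taking two extra references and then
dropping references one by one leaves toy_page_reusable() false until the
third put. A per-page flag cleared by the first put would report "reusable"
too early in that sequence, which is the multiple-I/O objection quoted above.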

Adding workarounds in the TCP stack to always copy the page fragments in
case of a retransmit is only a partial solution, as the remote peer could
be malicious and send an ACK _before_ the page content is actually read by
the NIC.
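
A minimal sketch of that early-ACK ordering, again as a userspace toy model
under stated assumptions: struct toy_tx, toy_fill() and toy_nic_dma() are
hypothetical names, and a memcpy() stands in for the NIC reading the page by
DMA. It only shows that a page referenced (not copied) for transmit exposes
whatever it contains at the moment the NIC reads it.

```c
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical toy model: one page shared by reference with the NIC. */
struct toy_tx {
    char page[TOY_PAGE_SIZE];   /* zero-copy payload, owned by the sender */
    char wire[TOY_PAGE_SIZE];   /* what actually went out on the wire */
};

/* The owner writes new content into the page (initial fill or reuse). */
static void toy_fill(struct toy_tx *tx, const char *data)
{
    memset(tx->page, 0, sizeof(tx->page));
    strncpy(tx->page, data, sizeof(tx->page) - 1);
}

/* Stand-in for NIC DMA: the page is read at transmit time, not at queue time. */
static void toy_nic_dma(struct toy_tx *tx)
{
    memcpy(tx->wire, tx->page, sizeof(tx->wire));
}
```

If the owner calls toy_fill() with the intended payload, treats a forged ACK
as the signal to reuse the page, calls toy_fill() again with unrelated data,
and only then toy_nic_dma() runs, the wire buffer holds the unrelated data.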

So if we rely on networking stacks to give the signal for page reuse, we
can have a major security issue.
