RE: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.

From: Xin, Xiaohui
Date: Wed Jun 09 2010 - 05:54:15 EST

>-----Original Message-----
>From: Herbert Xu [mailto:herbert@xxxxxxxxxxxxxxxxxxx]
>Sent: Tuesday, June 08, 2010 1:28 PM
>To: Stephen Hemminger
>Cc: Xin, Xiaohui; netdev@xxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx;
>linux-kernel@xxxxxxxxxxxxxxx; mst@xxxxxxxxxx; mingo@xxxxxxx; davem@xxxxxxxxxxxxx;
>Subject: Re: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.
>On Sun, Jun 06, 2010 at 04:13:48PM -0700, Stephen Hemminger wrote:
>> Still not sure this is a good idea for a couple of reasons:
>> 1. We already have lots of special cases with skb's (frags and fraglist),
>> and skb's travel through a lot of different parts of the kernel. So any
>> new change like this creates lots of exposed points for new bugs. Look
>> at cases like MD5 TCP and netfilter, and forwarding these SKB's to ipsec
>> and ppp and ...
>> 2. SKB's can have infinite lifetime in the kernel. If these buffers come from
>> a fixed size pool in an external device, they can easily all get tied up
>> if you have a slow listener. What happens then?
>I agree with Stephen on this.
>FWIW I don't think we even need the external pages concept in
>order to implement zero-copy receive (which I gather is the intent).
>Here is one way to do it, simply construct a completely non-linear
>packet in the driver, as you would if you were using the GRO frags
>interface (grep for napi_gro_frags under drivers/net for examples).
I'm not sure I understand your approach correctly:
1) Does it only work for drivers with the SG feature, since the packet
is non-linear?

2) Does skb->data still point to the guest user buffers?
If so, how do we avoid having to modify the net core's handling of the skb?

3) With our approach, only some of the drivers need to be modified to
support zero-copy. With yours, would we need to modify all the drivers?

If I have misunderstood your idea, could you explain it in more detail?
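
To make the "completely non-linear packet" idea concrete, here is a minimal user-space sketch. It is not kernel code and is not taken from the patch set or from napi_gro_frags(); every name in it (model_pkt, model_frag, model_pkt_add_frag, MODEL_MAX_FRAGS) is hypothetical. It only illustrates the shape being discussed: the linear head area holds zero bytes, and the payload is described by fragments that reference externally owned memory instead of copying it.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_FRAGS 4   /* hypothetical limit, for illustration only */

/* One fragment: a reference into memory owned by someone else. */
struct model_frag {
    const unsigned char *page;  /* external buffer, referenced, never copied */
    size_t offset;              /* start of the data within that buffer */
    size_t size;                /* number of bytes in this fragment */
};

/* Toy stand-in for a completely non-linear packet. */
struct model_pkt {
    size_t head_len;            /* bytes in the linear area; stays 0 here */
    size_t nr_frags;
    struct model_frag frags[MODEL_MAX_FRAGS];
};

void model_pkt_init(struct model_pkt *p)
{
    assert(p != NULL);
    p->head_len = 0;            /* empty head: all data lives in frags */
    p->nr_frags = 0;
}

/* Attach a buffer by reference. Returns 0 on success, -1 if no slot is left. */
int model_pkt_add_frag(struct model_pkt *p, const unsigned char *page,
                       size_t offset, size_t size)
{
    assert(p != NULL && page != NULL);
    if (p->nr_frags >= MODEL_MAX_FRAGS)
        return -1;
    p->frags[p->nr_frags].page = page;
    p->frags[p->nr_frags].offset = offset;
    p->frags[p->nr_frags].size = size;
    p->nr_frags++;
    return 0;
}

/* Total packet length: linear bytes plus all fragment bytes. */
size_t model_pkt_len(const struct model_pkt *p)
{
    size_t len, i;

    assert(p != NULL);
    len = p->head_len;
    for (i = 0; i < p->nr_frags; i++)
        len += p->frags[i].size;
    return len;
}
```

In this model a consumer that only reads the fragments never triggers a copy. Whether the real stack can always leave such a packet unmodified is exactly the open question in this thread.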

>This way you can transfer the entire contents of the packet without
>copying through to the other side, provided that the host stack does
>not modify the packet.

>If the host side did modify the packet then we have to incur the
>memory cost anyway.
>IOW I think the only feature provided by the external pages
>construct is allowing the skb->head area to be shared without
>copying. I'm claiming that this can be done by simply making
>skb->head empty.
I think making skb->head empty from the start will take more effort to pass
the checks on the skb header. Have I missed something here? What I actually do
is set skb->head to NULL just before kfree(skb) in skb_release_data(); this is
done by the callback we added to the skb.
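
The release-time behaviour described above can be sketched in user space as follows. This is only a rough model under stated assumptions, not the code from the patch: struct model_buf, model_return_to_owner() and model_release_data() are hypothetical names, and the owner_returned counter merely stands in for whatever bookkeeping the real buffer owner would do. The point it shows is that a callback hands the external buffer back and the head pointer is cleared before the ordinary free path runs, so memory the packet does not own is never freed.

```c
#include <assert.h>
#include <stdlib.h>

struct model_buf;
typedef void (*model_dtor_t)(struct model_buf *buf);

struct model_buf {
    unsigned char *head;   /* data area; may point at externally owned memory */
    model_dtor_t dtor;     /* non-NULL only when head is externally owned */
    int *owner_returned;   /* hypothetical counter kept by the buffer owner */
};

/* Example callback: tell the owner that its buffer is available again. */
void model_return_to_owner(struct model_buf *buf)
{
    assert(buf != NULL && buf->owner_returned != NULL);
    (*buf->owner_returned)++;
}

/* Release the data area, whoever owns it. */
void model_release_data(struct model_buf *buf)
{
    assert(buf != NULL);
    if (buf->dtor) {
        buf->dtor(buf);    /* external buffer goes back to its owner */
        buf->head = NULL;  /* cleared first, so the free below is a no-op */
    }
    free(buf->head);       /* frees only memory this buffer itself owns */
    buf->head = NULL;
}
```

A buffer without a callback takes the normal path and has its own allocation freed. A buffer with one is returned to its owner and left untouched by free().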


>Email: Herbert Xu 许志壬 <herbert@xxxxxxxxxxxxxxxxxxx>