Re: [PATCH net-next V2 1/3] tap: use build_skb() for small packet

From: Jason Wang
Date: Wed Aug 16 2017 - 05:17:59 EST


On 2017-08-16 12:07, Jason Wang wrote:


On 2017-08-16 11:59, Michael S. Tsirkin wrote:
On Wed, Aug 16, 2017 at 11:57:51AM +0800, Jason Wang wrote:

On 2017-08-16 11:55, Michael S. Tsirkin wrote:
On Tue, Aug 15, 2017 at 08:45:20PM -0700, Eric Dumazet wrote:
On Fri, 2017-08-11 at 19:41 +0800, Jason Wang wrote:
In the past we used tun_alloc_skb(), which calls sock_alloc_send_pskb(),
to allocate the skb. This socket-based method is not suitable for
high-speed userspace such as virtualization, which usually:

- ignores sk_sndbuf (INT_MAX) and expects to receive the packet as fast as
possible
- doesn't want to be blocked at sendmsg()

To eliminate the above overheads, this patch tries to use build_skb()
for small packets. We will do this only when the following conditions
are all met:

- TAP instead of TUN
- sk_sndbuf is INT_MAX
- caller doesn't want to be blocked
- zerocopy is not used
- packet size is small enough to use build_skb()
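
For illustration, the conditions above can be sketched as a single predicate. This is a minimal userspace sketch under stated assumptions, not the patch's code: struct tx_request, can_use_build_skb() and the SKETCH_* constants are hypothetical names, and the size limit is an assumed value chosen only to make the check concrete.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration only; the real limit depends on the
 * page size, the headroom and the skb shared-info overhead. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_OVERHEAD   512u

/* Hypothetical stand-in for the state of one write; not a kernel structure. */
struct tx_request {
    bool   is_tap;       /* TAP (Ethernet frames) rather than TUN */
    int    sndbuf;       /* socket send buffer limit (sk_sndbuf) */
    bool   nonblocking;  /* caller does not want to be blocked */
    bool   zerocopy;     /* zerocopy was requested */
    size_t len;          /* packet length in bytes */
};

/* Returns true only when every condition from the list is met. */
static bool can_use_build_skb(const struct tx_request *req)
{
    assert(req != NULL);

    if (!req->is_tap)
        return false;
    if (req->sndbuf != INT_MAX)
        return false;
    if (!req->nonblocking)
        return false;
    if (req->zerocopy)
        return false;
    /* "Small enough": packet plus assumed overhead fits in one page. */
    return req->len + SKETCH_OVERHEAD <= SKETCH_PAGE_SIZE;
}
```

Any request that fails one of the checks would fall back to the existing socket-based allocation path.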

Pktgen from guest to host shows ~11% improvement for rx pps of tap:

Before: ~1.70Mpps
After : ~1.88Mpps
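
As a quick check of the quoted figure, the relative gain follows directly from the two approximate rates above:

```latex
\[
\frac{1.88 - 1.70}{1.70} = \frac{0.18}{1.70} \approx 0.106
\]
```

That is roughly 10.6%, consistent with the ~11% stated.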

More importantly, this makes it possible to implement XDP for tap
before creating skbs.
Well well well.

You do realize that tun_build_skb() is not thread safe?
The issue is alloc frag, isn't it?
I guess for now we can limit this to XDP mode only, and
just allocate full pages in that mode.


Limiting this to XDP mode only does not prevent users from sending packets to
the same queue in parallel, I think?

Thanks
Yes but then you can just drop the page frag allocator since
XDP is assumed not to care about truesize for most packets.


OK, let me run some tests first to compare the numbers for the two methods.

Thanks

It looks like full page allocation just puts too much stress on the page allocator.

I get 1.58Mpps (full page) vs 1.95Mpps (page frag) with the patches attached.

Since the non-XDP case can also benefit from build_skb(), I am inclined to use a spinlock instead of full pages in this case.
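
To picture the spinlock option, here is a userspace sketch only, not the kernel code or the attached patches. It assumes a hypothetical per-queue struct frag_cache that carves small buffers out of a shared page, with a C11 atomic_flag standing in for the kernel spinlock; frag_cache_init(), frag_alloc() and FRAG_PAGE_SIZE are made-up names. The lock serializes updates to the shared page and offset, which is the state parallel writers to the same queue would otherwise race on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define FRAG_PAGE_SIZE 4096u  /* assumed page size */

/* Hypothetical userspace analogue of a per-queue page-frag cache. */
struct frag_cache {
    atomic_flag    lock;    /* stand-in for a spinlock */
    unsigned char *page;    /* current backing page, or NULL */
    size_t         offset;  /* next free byte within the page */
};

static void frag_cache_init(struct frag_cache *fc)
{
    atomic_flag_clear(&fc->lock);
    fc->page = NULL;
    fc->offset = 0;
}

static void frag_lock(struct frag_cache *fc)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&fc->lock, memory_order_acquire))
        ;
}

static void frag_unlock(struct frag_cache *fc)
{
    atomic_flag_clear_explicit(&fc->lock, memory_order_release);
}

/* Carve size bytes out of the current page, starting a fresh page when
 * the current one is exhausted. Returns NULL if no memory is available.
 * Old pages are never freed in this sketch; real code would reference
 * count them. */
static void *frag_alloc(struct frag_cache *fc, size_t size)
{
    void *buf = NULL;

    assert(size > 0 && size <= FRAG_PAGE_SIZE);

    frag_lock(fc);
    if (!fc->page || fc->offset + size > FRAG_PAGE_SIZE) {
        unsigned char *fresh = malloc(FRAG_PAGE_SIZE);
        if (fresh) {
            fc->page = fresh;
            fc->offset = 0;
        }
    }
    if (fc->page && fc->offset + size <= FRAG_PAGE_SIZE) {
        buf = fc->page + fc->offset;
        fc->offset += size;
    }
    frag_unlock(fc);
    return buf;
}
```

Without the lock, two writers could read the same offset and receive overlapping buffers. The full-page alternative avoids the shared offset entirely, at the cost of one page allocation per packet.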

Thanks