Re: Linux 5.12-rc7

From: Michael S. Tsirkin
Date: Tue Apr 13 2021 - 08:57:27 EST


On Mon, Apr 12, 2021 at 06:47:07PM +0200, Eric Dumazet wrote:
> On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >
> > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds
> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> > > >
> > > > Qemu test results:
> > > > total: 460 pass: 459 fail: 1
> > > > Failed tests:
> > > > sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > > >
> > > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not pull payload in
> > > > skb->head"). It is a spurious problem - the test passes roughly every other
> > > > time. When the failure is seen, udhcpc fails to get an IP address and aborts
> > > > with SIGTERM. So far I have only seen this with the "sh" architecture.
> > >
> > > Hmm. Let's add in some more of the people involved in that commit, and
> > > also netdev.
> > >
> > > Nothing in there looks like it should have any interaction with
> > > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > > some particular interaction with the qemu environment.
> >
> > Yes, maybe.
> >
> > I spent few hours on this, and suspect a buggy memcpy() implementation
> > on SH, but this was not conclusive.
> >
> > By pulling one extra byte, the problem goes away.
> >
> > Strange thing is that the udhcpc process does not go past sendto().
>
> This is the patch working around the issue. Unfortunately I was not
> able to root-cause it (I really suspect something on SH)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655
> 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct
> virtnet_info *vi,
>
> /* Copy all frame if it fits skb->head, otherwise
> * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
> + *
> + * Apparently, pulling only the Ethernet Header triggers a bug
> on qemu-system-sh4.
> + * Since GRO aggregation really cares of IPv4/IPv6, pull 20 bytes
> + * more to work around this bug : These 20 bytes can not belong
> + * to UDP/TCP payload.
> + * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
> */

Question: do we still want to do this for performance reasons?
We also have the hdr_len coming from the device which is
just skb_headlen on the host.

> if (len <= skb_tailroom(skb))
> copy = len;
> else
> - copy = ETH_HLEN + metasize;
> + copy = ETH_HLEN + sizeof(struct iphdr) + metasize;
> skb_put_data(skb, p, copy);
>
> if (metasize) {