Re: [PATCH net-next] xen-netfront: avoid packet loss when ethernet header crosses page boundary

From: David Vrabel
Date: Fri Sep 09 2016 - 09:41:54 EST


On 22/08/16 16:42, Vitaly Kuznetsov wrote:
> Small packet loss is reported on complex multi host network configurations
> including tunnels, NAT, ... My investigation led me to the following check
> in netback which drops packets:
>
> 	if (unlikely(txreq.size < ETH_HLEN)) {
> 		netdev_err(queue->vif->dev,
> 			   "Bad packet size: %d\n", txreq.size);
> 		xenvif_tx_err(queue, &txreq, extra_count, idx);
> 		break;
> 	}
>
> But this check itself is legitimate. SKBs consist of a linear part (which
> has to have the ethernet header) and (optionally) a number of frags.
> Netfront transmits the head of the linear part up to the page boundary
> as the first request and all the rest becomes frags so when we're
> reconstructing the SKB in netback we can't distinguish between original
> frags and the 'tail' of the linear part. The first request needs to be at
> least ETH_HLEN in size. So if an SKB's linear part starts too close to
> the page boundary, the packet is lost.
>
> I see two ways to fix the issue:
> - Change the 'wire' protocol between netfront and netback to start keeping
> the original SKB structure. We'll have to add a flag indicating the fact
> that the particular request is a part of the original linear part and not
> a frag. We'll need to know the length of the linear part to pre-allocate
> memory.
> - Avoid transmitting SKBs with linear parts starting too close to the page
> boundary. That seems preferable short-term and shouldn't bring
> significant performance degradation as such packets are rare. That's what
> this patch is trying to achieve with skb_copy().
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>

We should probably fix the backend to handle this (by grant copying a
minimum amount into the linear area), but netfront needs to work with
older netback, so this change is needed regardless.
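
For reference, the condition described in the quoted text can be sketched
in plain userspace C as below. This is only an illustration, not the code
from the patch: the sketch_* names and constants are hypothetical, and it
assumes 4 KiB pages and a 14-byte Ethernet header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration: 4 KiB pages, 14-byte Ethernet header. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_ETH_HLEN  14u

/* Offset of the start of the linear data within its page. */
static unsigned int sketch_offset_in_page(uintptr_t data)
{
	return (unsigned int)(data & (SKETCH_PAGE_SIZE - 1));
}

/*
 * Size of the first request: the head of the linear part up to the
 * page boundary, capped by the length of the linear part itself.
 */
static unsigned int sketch_first_req_len(uintptr_t data,
					 unsigned int linear_len)
{
	unsigned int room = SKETCH_PAGE_SIZE - sketch_offset_in_page(data);

	return linear_len < room ? linear_len : room;
}

/*
 * True when fewer than ETH_HLEN bytes remain before the page boundary,
 * i.e. the first request would be too small for the backend check and
 * the SKB would have to be copied to a better-aligned buffer first.
 */
static bool sketch_needs_copy(uintptr_t data)
{
	return SKETCH_PAGE_SIZE - sketch_offset_in_page(data) <
	       SKETCH_ETH_HLEN;
}
```

With these assumptions, linear data starting 8 bytes before a page
boundary yields a first request of 8 bytes and needs a copy, while data
starting exactly 14 bytes before the boundary does not.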

Acked-by: David Vrabel <david.vrabel@xxxxxxxxxx>

David