Re: [RFC] net: add TCP fraglist GRO support

From: Felix Fietkau
Date: Tue Apr 23 2024 - 06:25:53 EST


On 23.04.24 12:15, Eric Dumazet wrote:
> On Tue, Apr 23, 2024 at 11:41 AM Felix Fietkau <nbd@xxxxxxxx> wrote:
>>
>> When forwarding TCP after GRO, software segmentation is very expensive,
>> especially when the checksum needs to be recalculated.
>> One case where that's currently unavoidable is when routing packets over
>> PPPoE. Performance improves significantly when using fraglist GRO
>> implemented in the same way as for UDP.
>>
>> Here's a measurement of running 2 TCP streams through a MediaTek MT7622
>> device (2-core Cortex-A53), which runs NAT with flow offload enabled from
>> one ethernet port to PPPoE on another ethernet port + cake qdisc set to
>> 1Gbps.
>>
>> rx-gro-list off: 630 Mbit/s, CPU 35% idle
>> rx-gro-list on: 770 Mbit/s, CPU 40% idle
>
> Hi Felix
>
> changelog is a bit terse, and patch complex.
>
> Could you elaborate why this issue
> seems to be related to a specific driver ?
>
> I think we should push hard to not use frag_list in drivers :/
>
> And GRO itself could avoid building frag_list skbs
> in hosts where forwarding is enabled.
>
> (Note that we also can increase MAX_SKB_FRAGS to 45 these days)

The issue is not related to a specific driver at all. Here's how traffic flows: TCP packets are received by the SoC ethernet driver, and the network stack performs regular GRO on them. The aggregated packet gets forwarded by flow offloading until it reaches the PPPoE device. PPPoE does not support GSO packets, so the packet needs to be segmented again.
This is *very* expensive, since data needs to be copied and checksummed.

So in my patch, I changed the code to build fraglist GRO packets instead of regular GRO packets whenever there is no local socket to receive them. This makes segmenting very cheap, since the original skbs are preserved on the trip through the stack. The only cost is an extra socket lookup whenever NETIF_F_GRO_FRAGLIST is enabled.
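
To make the idea concrete, here is a rough userspace sketch of the decision and of why segmenting a fraglist packet is cheap. It is not the patch and not kernel code: `struct pkt`, `pick_gro_mode()`, `fraglist_append()` and `fraglist_segment()` are hypothetical names made up for illustration, and the socket lookup is reduced to a boolean passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an skb; NOT the kernel's struct sk_buff. */
struct pkt {
    size_t len;        /* payload length of this original packet */
    struct pkt *next;  /* next packet chained on the fraglist */
};

enum gro_mode { GRO_REGULAR, GRO_FRAGLIST };

/*
 * Hypothetical policy helper: use fraglist GRO only when the feature is
 * enabled and no local socket would receive the flow (i.e. the packets
 * are going to be forwarded). The real socket lookup is assumed to have
 * happened elsewhere and is modelled as has_local_socket.
 */
static enum gro_mode pick_gro_mode(bool fraglist_enabled, bool has_local_socket)
{
    if (fraglist_enabled && !has_local_socket)
        return GRO_FRAGLIST;
    return GRO_REGULAR;
}

/* Chain p onto the end of head's list; no payload bytes are copied. */
static void fraglist_append(struct pkt *head, struct pkt *p)
{
    struct pkt *tail = head;

    assert(head != NULL && p != NULL);
    while (tail->next)
        tail = tail->next;
    tail->next = p;
    p->next = NULL;
}

/*
 * "Segment" a fraglist packet: the original packets are simply unlinked
 * again. No data copy and no checksum recalculation over the payload is
 * modelled here, which is the point of the approach. Returns the number
 * of packets recovered.
 */
static size_t fraglist_segment(struct pkt *head)
{
    size_t count = 0;

    while (head) {
        struct pkt *next = head->next;

        head->next = NULL;
        count++;
        head = next;
    }
    return count;
}
```

In this model a forwarded flow (no local socket) gets chained with `fraglist_append()` and later split with `fraglist_segment()`, touching only list pointers. A regular GRO packet would instead have to be cut back into MSS-sized pieces with the payload copied and checksummed, which is the expensive path described above.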

PPPoE in this case is only an example. The same issue appears when forwarding to any netdev which does not support TSO, which in my case affects most wifi drivers as well.

- Felix