Re: [PATCH net-next 2/2] net: core: increase the default size of GRO_NORMAL skb lists to flush

From: Alexander Lobakin
Date: Sat Oct 12 2019 - 07:54:02 EST


Hi Eric,

Eric Dumazet wrote 12.10.2019 14:18:
On Sat, Oct 12, 2019 at 2:22 AM Alexander Lobakin <alobakin@xxxxxxxx> wrote:


I've come up with another solution. Considering that gro_normal_batch
is very individual for every single case, maybe it would be better to
make it a per-NAPI (or per-netdevice) variable rather than a global
one across the kernel?
I think most network-capable configurations and systems have more
than one network device nowadays, and they might need different values
to achieve their best.

One possible variant is:

#define THIS_DRIVER_GRO_NORMAL_BATCH 16

/* ... */

/* napi->gro_normal_batch will be set to the sysctl value during NAPI
 * context initialization
 */
netif_napi_add(dev, napi, this_driver_rx_poll, NAPI_POLL_WEIGHT);

/* new static inline helper, napi->gro_normal_batch will be set to the
 * driver-specific value of 16
 */
napi_set_gro_normal_batch(napi, THIS_DRIVER_GRO_NORMAL_BATCH);

The second possible variant is to make the gro_normal_batch sysctl
per-netdevice so that it can be tuned from userspace.
Or we can combine the two to make it tweakable from both the driver
and userspace, just as it is now with the XPS CPUs setting.

If you find any of this reasonable and worth implementing, I'll include
it in v2 after proper testing.
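
The combined variant described above could be modelled roughly as follows. This is a userspace sketch, not kernel code: struct napi_model, napi_model_init() and gro_normal_batch_default are hypothetical stand-ins, and the default of 8 is taken from the value mentioned later in this thread. Only the name napi_set_gro_normal_batch() comes from the proposal; its signature here is an assumption.

```c
/* Hypothetical stand-in for the global sysctl value. */
static int gro_normal_batch_default = 8;

/* Hypothetical, stripped-down model of a NAPI context. */
struct napi_model {
	int gro_normal_batch;
};

/* Models NAPI context initialization: inherit the global default. */
static void napi_model_init(struct napi_model *napi)
{
	napi->gro_normal_batch = gro_normal_batch_default;
}

/* Proposed helper (signature assumed): per-NAPI override set by a driver. */
static inline void napi_set_gro_normal_batch(struct napi_model *napi, int batch)
{
	napi->gro_normal_batch = batch;
}
```

In this model, a NAPI context that never calls the helper keeps following the global default chosen at initialization time, while a driver that calls napi_set_gro_normal_batch(&napi, 16) affects only its own context.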

Most likely the optimal tuning is also a function of the host cpu caches.

Building too big a list can also lead to premature cache evictions.

Tuning the value on your test machines does not mean the value will be good
for other systems.

Oh, I missed that it might be a lot more machine-dependent than
netdevice-dependent. Thank you for the explanation. The best I can do in
that case is to leave batch control in its current state.
I'll publish v2 containing only the acked first part of the series on
Monday if nothing serious comes up. Adding listified Rx to
napi_gro_receive() was the main goal anyway.


Adding yet another per device value should only be done if you demonstrate
a significant performance increase compared to the conservative value
Edward chose.

Also the behavior can be quite different depending on the protocols,
make sure you test handling of TCP pure ACK packets.

Accumulating 64 (in case the device uses standard NAPI_POLL_WEIGHT)
of them before entering upper stacks does not seem like a good choice, since
64 skbs will need to be kept in the GRO system, compared to only 8 with
Edward's value.
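
The arithmetic behind the 64-versus-8 comparison can be illustrated with a small userspace model. It is only a sketch under simplifying assumptions: simulate_poll() and struct batch_stats are hypothetical names, every packet is assumed to go onto the list (as pure ACKs that GRO cannot merge would), and any leftover is assumed to be flushed at the end of the poll.

```c
#include <stddef.h>

struct batch_stats {
	size_t flushes;   /* times the list was handed to the upper stack */
	size_t max_held;  /* largest number of skbs held at once */
};

/* Feed `packets` skbs from one poll into a list that is flushed when it
 * reaches `batch` entries; any remainder is flushed at the end of the poll.
 */
static struct batch_stats simulate_poll(size_t packets, size_t batch)
{
	struct batch_stats st = { 0, 0 };
	size_t held = 0;

	for (size_t i = 0; i < packets; i++) {
		held++;
		if (held > st.max_held)
			st.max_held = held;
		if (held >= batch) {
			st.flushes++;
			held = 0;
		}
	}
	if (held > 0)
		st.flushes++; /* end-of-poll flush of the leftover */
	return st;
}
```

With a full poll of 64 packets, a batch of 8 gives eight flushes with at most 8 skbs held at once, whereas a batch of 64 gives a single flush with all 64 skbs held until the end.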

Regards,