Re: [PATCH net-next 2/4] atl1c: improve performance by avoiding unnecessary pcie writes on xmit

From: Gatis Peisenieks
Date: Wed May 12 2021 - 10:35:54 EST




On 2021-05-12 05:39, Chris Snook wrote:
> Increases in latency tend to hurt more on single-queue devices. Has
> this been tested on the original gigabit atl1c?

Thank you, Chris, for checking this out!

I did test the atl1c driver with and without this change on actual
AR8151 hardware.

My test system was an Intel(R) Core(TM) i7-4790K with an RB44Ge,
which is a 4-port 1G AR8151-based card.

I measured latency with an external traffic generator while the test
system was doing L2 forwarding, receiving traffic on one atl1c interface
and transmitting over another atl1c interface. I had the default
1000-packet pfifo queue configured on the atl1c interfaces.

The maximum L2 forwarding rate for 64-byte packets improved from
860Kpps to 1070Kpps.

Any latency difference at 800Kpps was lost in the noise with this
particular traffic generator system (a Linux-based RouterOS traffic-gen).
I measured an average of 285us for a 30-second run in both cases. Note
that this includes any traffic generator "internal" latency.

Note that I had to tweak the atl1c tx interrupt moderation to get these
numbers. With the default tx_imt = 1000 I get only 500 tx interrupts/sec
no matter what. Since the tx clean is fast and does not get polled
repeatedly, and the ring size is 1024, I am limited to ~500Kpps.
tx_imt = 500 doubles that; I used tx_imt = 200 for this test.
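
The ceiling described above follows from simple arithmetic, sketched
here in Python. It assumes that one tx_imt unit is 2us, which is only
inferred from the 500 interrupts/sec seen at tx_imt = 1000, and that at
most one full ring is cleaned per interrupt. The function names are
made up for illustration.

```python
# Rough model of the tx rate ceiling imposed by interrupt moderation.
# ASSUMPTION: one tx_imt unit is 2us (inferred from 500 irq/s at tx_imt=1000).
# ASSUMPTION: at most one full ring of descriptors is reclaimed per interrupt.

IMT_UNIT_US = 2      # assumed timer resolution in microseconds
RING_SIZE = 1024     # tx ring size mentioned in this mail


def tx_irqs_per_sec(tx_imt):
    """Maximum tx interrupts per second for a given moderation value."""
    return 1_000_000 // (tx_imt * IMT_UNIT_US)


def max_tx_pps(tx_imt, ring_size=RING_SIZE):
    """Upper bound on tx packets per second: one ring per interrupt."""
    return tx_irqs_per_sec(tx_imt) * ring_size


for imt in (1000, 500, 200):
    print(f"tx_imt={imt}: {tx_irqs_per_sec(imt)} irq/s, <= {max_tx_pps(imt)} pps")
```

Under these assumptions tx_imt = 1000 caps out at 512000 pps, which is
consistent with the ~500Kpps limit, and tx_imt = 500 at 1024000 pps,
which would still be below the 1070Kpps measured with tx_imt = 200.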

As a side note that still relates to the latency discussion: what I
did find out on AR8151 hardware is that the rx interrupt moderation
timer value has a big effect on latency. Changing rx_imt from 200 to 10
reduced the average latency, as measured by the traffic generator, from
285us to 41us. I do not know the quirks of all the hardware supported
by the driver well enough to confidently put this in a patch, though.

The Mikrotik 10/25G NIC has its own interrupt moderation mechanism,
so this is not relevant to it, in case anyone is interested.



> - Chris
>
> On Tue, May 11, 2021 at 12:05 PM Gatis Peisenieks <gatis@xxxxxxxxxxxx> wrote:
>>
>> The kernel has xmit_more facility that hints the networking driver xmit
>> path about whether more packets are coming soon. This information can be
>> used to avoid unnecessary expensive PCIe transaction per tx packet at a
>> slight increase in latency.
>>
>> Max TX pps on Mikrotik 10/25G NIC in a Threadripper 3960X system
>> improved from 1150Kpps to 1700Kpps.
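
To illustrate the xmit_more idea from the patch description above, here
is a toy Python model; it is not driver code. It counts how many
doorbell (PCIe register) writes are needed when the write is deferred
until the stack signals that no more packets follow. The class and
function names are hypothetical; in the kernel the hint is read with
netdev_xmit_more().

```python
# Toy model of xmit_more batching; NOT the atl1c code.
# All names here (ToyTxRing, xmit, send_bursts) are hypothetical.

class ToyTxRing:
    def __init__(self, use_xmit_more):
        self.use_xmit_more = use_xmit_more
        self.queued = 0            # descriptors filled but not yet announced
        self.doorbell_writes = 0   # simulated PCIe producer-index writes
        self.sent = 0              # packets made visible to the device

    def xmit(self, more):
        """Queue one packet; 'more' is the stack's hint that another follows."""
        self.queued += 1
        # Without the optimization every packet triggers a register write.
        # With it, the write is deferred until the end of the burst.
        if not (self.use_xmit_more and more):
            self.doorbell_writes += 1
            self.sent += self.queued
            self.queued = 0


def send_bursts(ring, burst_sizes):
    """Feed bursts of packets; only the last packet of a burst has more=False."""
    for n in burst_sizes:
        for i in range(n):
            ring.xmit(more=(i < n - 1))
    return ring


bursts = [8, 1, 4]
plain = send_bursts(ToyTxRing(use_xmit_more=False), bursts)
batched = send_bursts(ToyTxRing(use_xmit_more=True), bursts)
print(plain.doorbell_writes, batched.doorbell_writes)
```

In this model both rings hand the same number of packets to the device,
but the batched one issues one write per burst instead of one per
packet. A real driver also has to issue the write when it stops the
queue, which this sketch leaves out.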