[PATCH net-next 0/1] net: stmmac: add per-q coalesce support

From: Ong Boon Leong
Date: Mon Mar 15 2021 - 02:42:24 EST


Hi,

This patch adds per-queue RX & TX coalesce control so that the user can
adjust the RX & TX interrupt moderation per queue. This is beneficial for
mixed-criticality control (according to VLAN priority) by the user application.

The patch has been tested with the following steps, and the results in
the ethtool output below look good.

########################################################################

> ethtool --show-coalesce eth0
Coalesce parameters for eth0:
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

> ethtool --per-queue eth0 queue_mask 0xFF --show-coalesce
Queue: 0
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 1
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 2
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 3
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 4
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 5
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 6
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 7
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

> ethtool --per-queue eth0 queue_mask 0x02 --coalesce rx-usecs 100 rx-frames 5
> ethtool --per-queue eth0 queue_mask 0x20 --coalesce rx-usecs 100 rx-frames 5
> ethtool --per-queue eth0 queue_mask 0x22 --show-coalesce
Queue: 1
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 99
rx-frames: 5
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 5
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 99
rx-frames: 5
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

> ethtool --per-queue eth0 queue_mask 0x04 --coalesce tx-usecs 156 tx-frames 26
> ethtool --per-queue eth0 queue_mask 0x40 --coalesce tx-usecs 156 tx-frames 26
> ethtool --per-queue eth0 queue_mask 0x44 --show-coalesce
Queue: 2
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 200
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 156
tx-frames: 26
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 6
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 200
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 156
tx-frames: 26
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

> ethtool --per-queue eth0 queue_mask 0xFF --coalesce rx-usecs 204 rx-frames 0
rx-frames unmodified, ignoring
rx-frames unmodified, ignoring
rx-frames unmodified, ignoring
rx-frames unmodified, ignoring
rx-frames unmodified, ignoring
rx-frames unmodified, ignoring
> ethtool --per-queue eth0 queue_mask 0xFF --coalesce tx-usecs 1000 tx-frames 25
tx-usecs unmodified, ignoring
tx-frames unmodified, ignoring
Queue 0, no coalesce parameters changed
tx-usecs unmodified, ignoring
tx-frames unmodified, ignoring
Queue 1, no coalesce parameters changed
tx-usecs unmodified, ignoring
tx-frames unmodified, ignoring
Queue 3, no coalesce parameters changed
tx-usecs unmodified, ignoring
tx-frames unmodified, ignoring
Queue 4, no coalesce parameters changed
tx-usecs unmodified, ignoring
tx-frames unmodified, ignoring
Queue 5, no coalesce parameters changed
tx-usecs unmodified, ignoring
tx-frames unmodified, ignoring
Queue 7, no coalesce parameters changed
> ethtool --show-coalesce eth0
Coalesce parameters for eth0:
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

> ethtool --per-queue eth0 queue_mask 0xFF --show-coalesce
Queue: 0
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 1
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 2
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 3
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 4
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 5
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 6
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

Queue: 7
Adaptive RX: off TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 202
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 1000
tx-frames: 25
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frames-low: 0
tx-usecs-low: 0
tx-frames-low: 0

rx-usecs-high: 0
rx-frames-high: 0
tx-usecs-high: 0
tx-frames-high: 0

########################################################################
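
For anyone reading the log above, the queue_mask values are plain bitmasks
in which bit N selects queue N, so 0x22 covers queues 1 and 5, 0x44 covers
queues 2 and 6, and 0xFF covers queues 0 to 7. The small Python sketch below
only illustrates that mapping; the helper names queues_from_mask and
mask_from_queues are made up for this note and are not part of ethtool or
of the patch.

```python
def queues_from_mask(mask):
    """Return the queue indices selected by a queue_mask value (bit N -> queue N)."""
    return [q for q in range(mask.bit_length()) if mask & (1 << q)]


def mask_from_queues(queues):
    """Build a queue_mask value from an iterable of queue indices."""
    mask = 0
    for q in queues:
        mask |= 1 << q
    return mask


if __name__ == "__main__":
    # Masks taken from the test steps in this cover letter.
    for mask in (0x02, 0x20, 0x22, 0x04, 0x40, 0x44, 0xFF):
        print(f"queue_mask {mask:#04x} selects queues {queues_from_mask(mask)}")
```

Going the other way, mask_from_queues([1, 5]) gives 0x22, which is the mask
used for the combined --show-coalesce check after the two RX updates.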

Thanks,
Boon Leong

Ong Boon Leong (1):
net: stmmac: add per-queue TX & RX coalesce ethtool support

.../ethernet/stmicro/stmmac/dwmac1000_dma.c | 2 +-
.../net/ethernet/stmicro/stmmac/dwmac4_dma.c | 7 +-
.../ethernet/stmicro/stmmac/dwxgmac2_dma.c | 7 +-
drivers/net/ethernet/stmicro/stmmac/hwif.h | 2 +-
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 8 +-
.../ethernet/stmicro/stmmac/stmmac_ethtool.c | 132 ++++++++++++++++--
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 48 ++++---
7 files changed, 157 insertions(+), 49 deletions(-)

--
2.25.1