RE: [RFC Patch net-next v1 0/9] r8169: add RSS support for RTL8127

From: Javen

Date: Mon Apr 27 2026 - 03:00:47 EST


>On 20.04.2026 04:19, javen wrote:
>> From: Javen Xu <javen_xu@xxxxxxxxxxxxxx>
>>
>> This patch series adds RSS support for RTL8127 in the r8169 driver.
>>
>> Currently, without RSS support, a single CPU core handles all incoming
>> traffic. Under heavy loads, this single core becomes a bottleneck,
>> causing high softirq usage and leading to unstable and degraded network
>> throughput.
>>
>> Therefore, we add RSS support for RTL8127. This RFC patch is intended
>> for discussion only. We ran some experiments on an AMD platform; the
>> results are below.
>>
>> Platform: AMD Ryzen Embedded R2514 with Radeon Graphics (4 Cores/8
>> Threads)
>
>An older embedded CPU (AFAICS from 2019, refreshed in 2022) in reality is
>unlikely to be used with sustained 10 Gbit/s traffic. It would be too weak to
>handle userspace apps making use of this high throughput. This hw edge case
>IMO isn't really an argument for adding 1,000 LoC, blowing up driver structs,
>and adding the complexity of dealing with a register layout changing every two
>chip versions.
>
>It's really a problem that Realtek frequently changes register layout and/or
>register semantics in a not backward-compatible way (and doesn't provide
>documentation), resulting in ugly versioned stuff like the following.
>
>IMR_V2_SET_REG_8125 = 0x0d0c,
>IMR_V2_CLEAR_REG_8125 = 0x0d00,
>IMR_V4_L2_CLEAR_REG_8125 = 0x0d10,
>ISR_V2_8125 = 0x0d04,
>ISR_V4_L2_8125 = 0x0d14,
>
>case RTL_GIGA_MAC_VER_80:
> tp->HwSuppIsrVer = 6;
>default:
> tp->HwSuppIsrVer = 1;
>
>This messy hw design makes it hard to develop maintainable drivers.
>This is underlined by the fact that Realtek has separate r8125, r8126,
>r8127 drivers, even though they share most of the code.
>
>> Arch: x86_64
>> Test command:
>> Server: iperf3 -s
>> Client: iperf3 -c 192.168.2.1 -P 20 -t 3600
>> Monitor: mpstat -P ALL 1
>>
>> Before this patch (Without RSS):
>> Throughput: Unstable, fluctuating between 3.76 Gbits/sec and
>> 8.2 Gbits/sec.
>> CPU Usage: A single CPU core is fully occupied with softirq reaching
>> up to 96%.
>>
>> After this patch (With RSS enabled):
>> Throughput: Stable at 9.42 Gbits/sec.
>> CPU Usage: The traffic load is evenly distributed across multiple CPU
>> cores. The maximum softirq on a single core dropped to 63%.
>>
>> Patch summary:
>> Patch 1: Adds necessary macro and register definitions for RSS.
>> Patch 2-4: Support NAPI and multi RX/TX queues.
>
>Driver supports NAPI already.
>
>> Patch 5-6: Support MSI-X and enables it specifically for RTL8127.
>
>Also MSI-X is used already.
>
>> Patch 7: Enables RSS for RTL8127.
>> Patch 8-9: Adds ethtool support to configure the number of RX queues.
>>
>> Javen Xu (9):
>>   r8169: add some register definitions
>>   r8169: add napi and irq support
>>   r8169: add support for multi tx queues
>>   r8169: add support for multi rx queues
>>   r8169: add support for msix
>>   r8169: enable msix for RTL8127
>>   r8169: add support and enable rss
>>   r8169: move struct ethtool_ops
>>   r8169: add support for ethtool
>>
>> drivers/net/ethernet/realtek/r8169_main.c | 1437 ++++++++++++++++++---
>> 1 file changed, 1238 insertions(+), 199 deletions(-)
>>
>
>Series includes functions like rtl8169_desc_quirk() indicating a need to work
>around hw errata. Would be helpful to add comments describing the hw
>erratum, best with a link to documentation.

This is a workaround for a hardware erratum on RTL8127. The hardware
cannot guarantee that the descriptor's OwnBit has been fully written to
host memory before the interrupt is triggered. If the CPU handles the
interrupt very quickly, it may read stale descriptor data in which
DescOwn is still set and incorrectly skip the packet.

The recheck_desc_ownbit flag and the subsequent rtl8127_desc_quirk() call
were introduced to wait for the descriptor write to complete and then
check the descriptor one last time.
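
To illustrate the idea only (this is not the code from the series), a
minimal userspace sketch in C could look like the following. The names
rx_desc_ready(), settle_fn, simulate_late_writeback() and the struct
layout are hypothetical. The settle callback stands in for whatever short
wait the driver actually uses, and a single recheck is an assumption
based on the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Set while the NIC still owns the descriptor (illustrative bit position). */
#define DESC_OWN (UINT32_C(1) << 31)

/* Hypothetical, simplified RX descriptor; the NIC writes opts1 via DMA. */
struct rx_desc {
	volatile uint32_t opts1;
	volatile uint32_t opts2;
	volatile uint64_t addr;
};

/* Hypothetical hook standing in for a short wait (e.g. a small delay). */
typedef void (*settle_fn)(struct rx_desc *desc);

/*
 * Return true if the descriptor has been handed back to the CPU.
 * If the first read still shows DESC_OWN and recheck is set, wait once
 * and read the ownership bit one last time before skipping the packet.
 */
static bool rx_desc_ready(struct rx_desc *desc, bool recheck, settle_fn settle)
{
	assert(desc != NULL);

	if (!(desc->opts1 & DESC_OWN))
		return true;	/* write-back already visible */

	if (!recheck)
		return false;	/* no quirk: treat as not yet received */

	if (settle)
		settle(desc);	/* give the delayed DMA write time to land */

	/*
	 * In a real driver a read barrier such as dma_rmb() would be needed
	 * after this check and before reading the other descriptor fields.
	 */
	return !(desc->opts1 & DESC_OWN);
}

/* Test helper simulating the late write-back arriving in host memory. */
static void simulate_late_writeback(struct rx_desc *desc)
{
	desc->opts1 &= ~DESC_OWN;
}
```

Without the recheck, a descriptor whose write-back arrives just after the
interrupt is skipped until the next poll. With it, the packet is picked
up in the same pass at the cost of one extra read on the slow path.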


Thanks for your review and suggestions.

Summary of changes in the upcoming v2:
- remove the multi TX queue patch
- rename some macro definitions, such as RXS_8125B_RSS_UDP_V4
- convert enum rtl8127_rss_register_content to #define and use the BIT() macro
- run checkpatch and explain the usage of dma_wmb() etc.
- fix typos (e.g., DEAFULT)

BRs,
Javen