Re: 1e918876 breaks r8169 (linux-3.18+)

From: Florian Westphal
Date: Tue Feb 03 2015 - 05:42:30 EST


Tomas Szepe <szepe@xxxxxxxxxxxxxxx> wrote:
> Hi,
>
> Since linux-3.18.0, r8169 is having problems driving one of my add-on
> PCIe NICs. The interface is losing link for several seconds at a time,
> the frequency being about once a minute when the traffic is high.
>
> The first loss of link is accompanied by the message "NETDEV WATCHDOG:
> eth1 (r8169): transmit queue 0 timed out" and a call trace, while
> subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
> (w/o the complementary "link down" message).
>
> I've traced the culprit down to commit 1e918876, "r8169: add support
> for Byte Queue Limits" by Florian Westphal <fw@xxxxxxxxx>. Reverting
> the patch appears to fix the problem on linux-3.18.5.
> The same issue might already have been reported by Marco Berizzi here:
> http://lkml.org/lkml/2014/12/11/65

Thanks for reporting this! I'm no lkml subscriber and thus did not
see earlier report.

I'll try to reproduce this but unfortunately I am currently travelling
and won't have access to my r8169 nic until Feb 10th.

If you're willing to experiment (not even compile tested):

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -5067,8 +5067,6 @@ static void rtl_hw_reset(struct rtl8169_private *tp)
RTL_W8(ChipCmd, CmdReset);

rtl_udelay_loop_wait_low(tp, &rtl_chipcmd_cond, 100, 100);
-
- netdev_reset_queue(tp->dev);
}

static void rtl_request_uncached_firmware(struct rtl8169_private *tp)
@@ -6773,6 +6771,7 @@ static void rtl8169_tx_clear(struct rtl8169_private *tp)
{
rtl8169_tx_clear_range(tp, tp->dirty_tx, NUM_TX_DESC);
tp->cur_tx = tp->dirty_tx = 0;
+ netdev_reset_queue(tp->dev);
}

static void rtl_reset_work(struct rtl8169_private *tp)
@@ -7231,9 +7230,9 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
tx_left--;
}

- if (tp->dirty_tx != dirty_tx) {
- netdev_completed_queue(tp->dev, pkts_compl, bytes_compl);
+ netdev_completed_queue(tp->dev, pkts_compl, bytes_compl);

+ if (tp->dirty_tx != dirty_tx) {
u64_stats_update_begin(&tp->tx_stats.syncp);
tp->tx_stats.packets += pkts_compl;
tp->tx_stats.bytes += bytes_compl;
@@ -7263,6 +7262,8 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)

RTL_W8(TxPoll, NPQ);
}
+ } else {
+ WARN_ON_ONCE(pkts_compl);
}
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/