[PATCH net v3 12/12] net/sonic: Prevent tx watchdog timeout

From: Finn Thain
Date: Wed Jan 22 2020 - 17:24:13 EST


Section 5.5.3.2 of the datasheet says,

If FIFO Underrun, Byte Count Mismatch, Excessive Collision, or
Excessive Deferral (if enabled) errors occur, transmission ceases.

In this situation, the chip asserts a TXER interrupt rather than TXDN.
But the handler for the TXDN is the only way that the transmit queue
gets restarted. Hence, an aborted transmission can result in a watchdog
timeout.

This problem can be reproduced on congested link, as that can result in
excessive transmitter collisions. Another way to reproduce this is with
a FIFO Underrun, which may be caused by DMA latency.

In event of a TXER interrupt, prevent a watchdog timeout by restarting
transmission.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@xxxxxxxxx>
Signed-off-by: Finn Thain <fthain@xxxxxxxxxxxxxxxxxxx>
---
drivers/net/ethernet/natsemi/sonic.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/natsemi/sonic.c b/drivers/net/ethernet/natsemi/sonic.c
index 27b6f6585527..05e760444a92 100644
--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -415,10 +415,19 @@ static irqreturn_t sonic_interrupt(int irq, void *dev_id)
lp->stats.rx_missed_errors += 65536;

/* transmit error */
- if (status & SONIC_INT_TXER)
- if (SONIC_READ(SONIC_TCR) & SONIC_TCR_FU)
- netif_dbg(lp, tx_err, dev, "%s: tx fifo underrun\n",
- __func__);
+ if (status & SONIC_INT_TXER) {
+ u16 tcr = SONIC_READ(SONIC_TCR);
+
+ netif_dbg(lp, tx_err, dev, "%s: TXER intr, TCR %04x\n",
+ __func__, tcr);
+
+ if (tcr & (SONIC_TCR_EXD | SONIC_TCR_EXC |
+ SONIC_TCR_FU | SONIC_TCR_BCM)) {
+ /* Aborted transmission. Try again. */
+ netif_stop_queue(dev);
+ SONIC_WRITE(SONIC_CMD, SONIC_CR_TXP);
+ }
+ }

/* bus retry */
if (status & SONIC_INT_BR) {
--
2.24.1