RE: [PATCH] net: fec: Detect and recover receive queue hangs

From: Andy Duan
Date: Fri Nov 18 2016 - 03:19:38 EST


From: Chris Lesiak <chris.lesiak@xxxxxxxxx> Sent: Friday, November 18, 2016 5:15 AM
>To: Andy Duan <fugang.duan@xxxxxxx>
>Cc: netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Jaccon
>Bastiaansen <jaccon.bastiaansen@xxxxxxxxx>; chris.lesiak@xxxxxxxxx
>Subject: [PATCH] net: fec: Detect and recover receive queue hangs
>
>This corrects a problem that appears to be similar to ERR006358. But while
>ERR006358 is a race when the tx queue transitions from empty to not empty,
>this problem is a race when the rx queue transitions from full to not full.
>
>The symptom is a receive queue that is stuck. The ENET_RDAR register will
>read 0, indicating that there are no empty receive descriptors in the receive
>ring. Since no additional frames can be queued, no RXF interrupts occur.
>
>This problem can be triggered with a 1 Gb link and about 400 Mbps of traffic.
>
>This patch detects this condition, sets the work_rx bit, and reschedules the
>poll method.
>
>Signed-off-by: Chris Lesiak <chris.lesiak@xxxxxxxxx>
>---
> drivers/net/ethernet/freescale/fec_main.c | 31
>+++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
>
Firstly, how to reproduce the issue, pls list the reproduce steps. Thanks.
Secondly, pls check below comments.

>diff --git a/drivers/net/ethernet/freescale/fec_main.c
>b/drivers/net/ethernet/freescale/fec_main.c
>index fea0f33..8a87037 100644
>--- a/drivers/net/ethernet/freescale/fec_main.c
>+++ b/drivers/net/ethernet/freescale/fec_main.c
>@@ -1588,6 +1588,34 @@ fec_enet_interrupt(int irq, void *dev_id)
> return ret;
> }
>
>+static inline bool
>+fec_enet_recover_rxq(struct fec_enet_private *fep, u16 queue_id) {
>+ int work_bit = (queue_id == 0) ? 2 : ((queue_id == 1) ? 0 : 1);
>+
>+ if (readl(fep->rx_queue[queue_id]->bd.reg_desc_active))
If rx ring is really empty in slight throughput cases, rdar is always cleared, then there always do napi reschedule.

>+ return false;
>+
>+ dev_notice_once(&fep->pdev->dev, "Recovered rx queue\n");
>+
>+ fep->work_rx |= 1 << work_bit;
>+
>+ return true;
>+}
>+
>+static inline bool fec_enet_recover_rxqs(struct fec_enet_private *fep)
>+{
>+ unsigned int q;
>+ bool ret = false;
>+
>+ for (q = 0; q < fep->num_rx_queues; q++) {
>+ if (fec_enet_recover_rxq(fep, q))
>+ ret = true;
>+ }
>+
>+ return ret;
>+}
>+
> static int fec_enet_rx_napi(struct napi_struct *napi, int budget) {
> struct net_device *ndev = napi->dev;
>@@ -1601,6 +1629,9 @@ static int fec_enet_rx_napi(struct napi_struct *napi,
>int budget)
> if (pkts < budget) {
> napi_complete(napi);
> writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
>+
>+ if (fec_enet_recover_rxqs(fep) && napi_reschedule(napi))
>+ writel(FEC_NAPI_IMASK, fep->hwp + FEC_IMASK);
> }
> return pkts;
> }
>--
>2.5.5