[PATCH 3.12 75/88] net/qlge: Avoids recursive EEH error

From: Jiri Slaby
Date: Thu Jul 14 2016 - 04:21:25 EST


From: Gavin Shan <gwshan@xxxxxxxxxxxxxxxxxx>

3.12-stable review patch. If anyone has any objections, please let me know.

===============

commit 3275c0c6c522ab04afa14f80efdac6213c3883d6 upstream.

One timer, whose handler keeps reading on MMIO register for EEH
core to detect error in time, is started when the PCI device driver
is loaded. MMIO register can't be accessed during PE reset in EEH
recovery. Otherwise, the unexpected recursive error is triggered.
The timer isn't closed that time if the interface isn't brought
up. So the unexpected recursive error is seen during EEH recovery
when the interface is down.

This avoids the unexpected recursive EEH error by closing the timer
in qlge_io_error_detected() before EEH PE reset unconditionally. The
timer is started unconditionally after EEH PE reset in qlge_io_resume().
Also, the timer should be closed unconditionally when the device is
removed from the system permanently in qlge_io_error_detected().

Reported-by: Shriya R. Kulkarni <shriyakul@xxxxxxxxxx>
Signed-off-by: Gavin Shan <gwshan@xxxxxxxxxxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Jiri Slaby <jslaby@xxxxxxx>
---
drivers/net/ethernet/qlogic/qlge/qlge_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_main.c b/drivers/net/ethernet/qlogic/qlge/qlge_main.c
index 151478a59e30..a8cbeed1968a 100644
--- a/drivers/net/ethernet/qlogic/qlge/qlge_main.c
+++ b/drivers/net/ethernet/qlogic/qlge/qlge_main.c
@@ -4780,7 +4780,6 @@ static void ql_eeh_close(struct net_device *ndev)
}

/* Disabling the timer */
- del_timer_sync(&qdev->timer);
ql_cancel_all_work_sync(qdev);

for (i = 0; i < qdev->rss_ring_count; i++)
@@ -4807,6 +4806,7 @@ static pci_ers_result_t qlge_io_error_detected(struct pci_dev *pdev,
return PCI_ERS_RESULT_CAN_RECOVER;
case pci_channel_io_frozen:
netif_device_detach(ndev);
+ del_timer_sync(&qdev->timer);
if (netif_running(ndev))
ql_eeh_close(ndev);
pci_disable_device(pdev);
@@ -4814,6 +4814,7 @@ static pci_ers_result_t qlge_io_error_detected(struct pci_dev *pdev,
case pci_channel_io_perm_failure:
dev_err(&pdev->dev,
"%s: pci_channel_io_perm_failure.\n", __func__);
+ del_timer_sync(&qdev->timer);
ql_eeh_close(ndev);
set_bit(QL_EEH_FATAL, &qdev->flags);
return PCI_ERS_RESULT_DISCONNECT;
--
2.9.1