[PATCH v2] igb: Fix watchdog_task race with shutdown
From: Ian Ray
Date: Tue Jun 03 2025 - 04:10:20 EST
A rare [1] race condition is observed between the igb_watchdog_task and
shutdown on a dual-core i.MX6 based system with two I210 controllers.
printk tracing shows that igb_watchdog_task is hung in igb_read_phy_reg
because __igb_shutdown has already called __igb_close.
The fix is to delete the timers and cancel the work after setting IGB_DOWN.
This approach mirrors igb_up.
reboot             kworker
__igb_shutdown
  rtnl_lock
  __igb_close
  :                igb_watchdog_task
  :                :
  :                igb_read_phy_reg (hung)
  rtnl_unlock
[1] Note that this is easier to reproduce with 'initcall_debug' logging
and additional printk logging in igb_main.
Signed-off-by: Ian Ray <ian.ray@xxxxxxxxxxxxxxxx>
---
Changes in v2:
- Change strategy to avoid taking RTNL.
- Link to v1: https://lore.kernel.org/all/20250428115450.639-1-ian.ray@xxxxxxxxxxxxxxxx/
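For reviewers, the ordering argument can be illustrated with a small
user-space model. This is only a sketch under simplifying assumptions:
it is single-threaded, every name in it (model_adapter, model_run_work,
model_late_phy_reads) is hypothetical and does not exist in igb, and it
assumes the watchdog work is already queued when teardown starts. It
counts PHY accesses that happen after teardown, which is where the real
hardware hangs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model: none of these names exist in igb. */
struct model_adapter {
	bool down;          /* stands in for __IGB_DOWN */
	bool timer_armed;   /* stands in for watchdog_timer */
	bool work_queued;   /* stands in for watchdog_task */
	bool hw_closed;     /* set once teardown has run */
	int late_phy_reads; /* PHY accesses after teardown (the hang) */
};

/* Work body: touches the PHY, re-arms the timer only while not down. */
static void model_run_work(struct model_adapter *a)
{
	if (!a->work_queued)
		return;
	a->work_queued = false;
	if (a->hw_closed)
		a->late_phy_reads++;
	if (!a->down)
		a->timer_armed = true;
}

/*
 * Tear down with the watchdog work already queued.
 * cancel_first == true models the ordering in this patch;
 * false models leaving the queued work alone until after teardown.
 */
static int model_late_phy_reads(bool cancel_first)
{
	struct model_adapter a = { .timer_armed = true, .work_queued = true };

	a.down = true;                 /* like set_bit(__IGB_DOWN, ...) */
	if (cancel_first) {
		a.timer_armed = false; /* like timer_delete_sync() */
		a.work_queued = false; /* like cancel_work_sync() */
	}
	a.hw_closed = true;            /* remainder of the teardown */
	model_run_work(&a);            /* the kworker gets to run now */

	/* With the patched ordering nothing is left armed. */
	assert(!a.timer_armed || !cancel_first);
	return a.late_phy_reads;
}
```

In this model, model_late_phy_reads(false) reports one late PHY access
and model_late_phy_reads(true) reports none, because the queued work is
cancelled before the hardware is torn down.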
---
drivers/net/ethernet/intel/igb/igb_main.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 9e9a5900e6e5..a65ae7925ae8 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2175,10 +2175,14 @@ void igb_down(struct igb_adapter *adapter)
u32 tctl, rctl;
int i;
- /* signal that we're down so the interrupt handler does not
- * reschedule our watchdog timer
+ /* The watchdog timer may be rescheduled, so explicitly
+ * disable watchdog from being rescheduled.
*/
set_bit(__IGB_DOWN, &adapter->state);
+ timer_delete_sync(&adapter->watchdog_timer);
+ timer_delete_sync(&adapter->phy_info_timer);
+
+ cancel_work_sync(&adapter->watchdog_task);
/* disable receives in the hardware */
rctl = rd32(E1000_RCTL);
@@ -2210,9 +2214,6 @@ void igb_down(struct igb_adapter *adapter)
}
}
- timer_delete_sync(&adapter->watchdog_timer);
- timer_delete_sync(&adapter->phy_info_timer);
-
/* record the stats before reset*/
spin_lock(&adapter->stats64_lock);
igb_update_stats(adapter);
--
2.49.0