[PATCH] net: clear offline CPU backlog.state in dev_cpu_dead()
From: wangyongyong
Date: Wed Jul 23 2025 - 09:21:08 EST
From: wangyongyong <wangyongyong@xxxxxxxxxxx>
When a packet is enqueued to a remote CPU's backlog queue via
enqueue_to_backlog(), the following race with CPU hotplug can occur:

1. The source CPU sets NAPI_STATE_SCHED in the target CPU's
   softnet_data->backlog.state.
2. The source CPU raises NET_RX_SOFTIRQ so that an IPI is sent to the
   target CPU to schedule NAPI polling.
3. The target CPU is taken offline before the IPI arrives.
4. dev_cpu_dead() does not clear NAPI_STATE_SCHED, because the backlog
   is not on the offline CPU's poll_list.

This results in:

- A stale NAPI_STATE_SCHED flag in the offline CPU's backlog.state.
- When the target CPU comes back online, the stale flag prevents the
  backlog from being added to poll_list again, so packet processing
  stalls.

Fix this in dev_cpu_dead() by clearing every bit of oldsd->backlog.state
except NAPIF_STATE_THREADED.

Signed-off-by: wangyongyong <wangyongyong@xxxxxxxxxxx>
---
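For reviewers, the race described in the commit message can be sketched
as a small userspace model. This is only an illustration under stated
assumptions: struct model_backlog, the MODEL_STATE_* bits and the
model_*() helpers are hypothetical stand-ins invented here, not kernel
code, and the model ignores locking and the actual IPI machinery.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's NAPI state bits (values are illustrative). */
#define MODEL_STATE_SCHED    (1UL << 0)
#define MODEL_STATE_THREADED (1UL << 1)

/* Toy model of one CPU's backlog NAPI; not the kernel's struct napi_struct. */
struct model_backlog {
	unsigned long state;  /* models softnet_data->backlog.state */
	bool on_poll_list;    /* true once the target CPU has queued it for polling */
};

/*
 * Steps 1-2: the source CPU claims SCHED; only the claimer requests polling.
 * Returns true if this call requested polling on the target CPU.
 */
static bool model_enqueue(struct model_backlog *b)
{
	if (b->state & MODEL_STATE_SCHED)
		return false;  /* already marked scheduled: nothing more is done */
	b->state |= MODEL_STATE_SCHED;
	return true;           /* an IPI would now be sent to the target CPU */
}

/*
 * Steps 3-4: the target goes offline before the IPI lands, so the backlog
 * never reached poll_list. apply_fix models the one-line change in the patch.
 */
static void model_cpu_dead(struct model_backlog *b, bool apply_fix)
{
	assert(!b->on_poll_list);  /* precondition of the scenario */
	if (apply_fix)
		b->state &= MODEL_STATE_THREADED;  /* keep THREADED, drop the rest */
}

/*
 * Runs the whole scenario. Returns true if an enqueue made after the CPU
 * comes back online is able to request polling again.
 */
static bool model_recovers(bool apply_fix)
{
	struct model_backlog b = { .state = MODEL_STATE_THREADED, .on_poll_list = false };
	bool first = model_enqueue(&b);  /* sets SCHED; the IPI is "in flight" */

	assert(first);
	model_cpu_dead(&b, apply_fix);   /* IPI lost: on_poll_list stays false */
	return model_enqueue(&b);        /* enqueue after the CPU is back online */
}
```

In this model, model_recovers(false) returns false because the stale
SCHED bit makes the second enqueue a no-op, while model_recovers(true)
returns true. The mask also leaves the THREADED bit set.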
net/core/dev.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index be97c440ecd5..fd92ab79c02a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -12385,6 +12385,7 @@ static int dev_cpu_dead(unsigned int oldcpu)
 		else
 			____napi_schedule(sd, napi);
 	}
+	oldsd->backlog.state &= NAPIF_STATE_THREADED;
 	raise_softirq_irqoff(NET_TX_SOFTIRQ);
 	local_irq_enable();
--
2.25.1