PATCH: Fix race in cpu_down (hotplug cpu)

From: Nigel Cunningham
Date: Sun Sep 18 2005 - 22:29:49 EST


Hmm... managed to miss a word at the end of the first para and thus not
make sense. Let's try again.

----------

Hi.

There is a race condition in taking down a cpu (kernel/cpu.c::cpu_down).
A cpu can already be idling when we clear its online flag, and we do not
force the idle task to reschedule. This results in __cpu_die timing out.
A simple fix is to force the idle task on the cpu going down to reschedule.

Without the patch below, Suspend2 get into a deadlock at resume time
when this issue occurs. I could not complete 20 cycles without seeing
the issue. With the patch below, I have completed 75 cycles on the trot
without problems.

Please apply.

Signed-off-by: Nigel Cunningham <ncunningham@xxxxxxxxxxxx>

diff -ruNp 9910-hotplug-cpu-race.patch-old/kernel/cpu.c 9910-hotplug-cpu-race.patch-new/kernel/cpu.c
--- 9910-hotplug-cpu-race.patch-old/kernel/cpu.c 2005-08-29 10:29:58.000000000 +1000
+++ 9910-hotplug-cpu-race.patch-new/kernel/cpu.c 2005-09-19 12:15:08.000000000 +1000
@@ -126,6 +126,9 @@ int cpu_down(unsigned int cpu)
while (!idle_cpu(cpu))
yield();

+ /* CPU may have idled before we set its offline flag. */
+ set_tsk_need_resched(idle_task(cpu));
+
/* This actually kills the CPU. */
__cpu_die(cpu);




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/