[PATCH] stop_machine() vs. synchronous IPI send deadlock

From: Kirill Korotaev
Date: Wed Nov 09 2005 - 06:07:46 EST


Hello Andrew,

This patch fixes a deadlock between stop_machine() and a synchronous IPI send.
The problem is that stop_machine() disables interrupts on its own CPU before preemption has been disabled on the other CPUs. So if another CPU is preempted and then calls something like flush_tlb_all(), it deadlocks with the CPU doing stop_machine(), which can't process the IPI because its IRQs are disabled.

I changed stop_machine() to do exactly the same things as it makes the other CPUs do, i.e. it now disables preemption first on _all_ CPUs, including itself, and only after that disables IRQs.
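
To illustrate why the ordering matters, here is a rough userspace toy model of two CPUs. It is only a sketch and not kernel code: every name in it (irq_order, struct model, stop_machine_model_completes) is made up for this illustration, and it assumes CPU1 cannot run its stopmachine thread while it spins waiting for CPU0 to acknowledge the IPI.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout: a userspace toy model, not kernel code. */
enum irq_order {
	IRQ_OFF_BEFORE_PREPARE,	/* old ordering: IRQs off, then PREPARE */
	IRQ_OFF_AFTER_PREPARE	/* patched ordering: PREPARE, then IRQs off */
};

struct model {
	bool cpu0_irqs_on;	/* CPU0 is the one running stop_machine() */
	bool cpu1_ipi_pending;	/* CPU1 spins until CPU0 acks its sync IPI */
	bool cpu1_prepared;	/* CPU1's stopmachine thread acked PREPARE */
};

/*
 * Returns true if CPU0 gets as far as the DISABLE_IRQ phase within
 * max_steps iterations, false if the model is still stuck (deadlock).
 */
bool stop_machine_model_completes(enum irq_order order, bool cpu1_sends_ipi,
				  int max_steps)
{
	struct model m = { true, cpu1_sends_ipi, false };

	assert(max_steps > 0);

	/* Old ordering: CPU0 turns IRQs off before waiting for PREPARE acks. */
	if (order == IRQ_OFF_BEFORE_PREPARE)
		m.cpu0_irqs_on = false;

	for (int step = 0; step < max_steps; step++) {
		/* CPU0 can service the IPI only while its IRQs are enabled. */
		if (m.cpu1_ipi_pending && m.cpu0_irqs_on)
			m.cpu1_ipi_pending = false;

		/* CPU1 reaches its stopmachine thread only when not spinning. */
		if (!m.cpu1_ipi_pending)
			m.cpu1_prepared = true;

		/* All CPUs prepared: now CPU0 may safely disable IRQs. */
		if (m.cpu1_prepared) {
			m.cpu0_irqs_on = false;
			return true;
		}
	}
	return false;
}
```

In this model, the old ordering with an IPI in flight never completes, while the patched ordering does, because CPU0 can still acknowledge the IPI during the PREPARE phase.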

Signed-off-by: Kirill Korotaev <dev@xxxxx>

Kirill

--- ./kernel/stop_machine.c.stpmach 2005-11-01 12:06:03.000000000 +0300
+++ ./kernel/stop_machine.c 2005-11-09 13:58:03.000000000 +0300
@@ -114,13 +114,12 @@ static int stop_machine(void)
 		return ret;
 	}

-	/* Don't schedule us away at this point, please. */
-	local_irq_disable();
-
 	/* Now they are all started, make them hold the CPUs, ready. */
+	preempt_disable();
 	stopmachine_set_state(STOPMACHINE_PREPARE);

 	/* Make them disable irqs. */
+	local_irq_disable();
 	stopmachine_set_state(STOPMACHINE_DISABLE_IRQ);

 	return 0;