Re: [PATCH 1/11] Add generic helpers for arch IPI function calls

From: Mark Lord
Date: Thu Apr 24 2008 - 08:44:28 EST


Jens Axboe wrote:
> On Wed, Apr 23 2008, Mark Lord wrote:
>> Jens Axboe wrote:
>>> On Wed, Apr 23 2008, Mark Lord wrote:
>> ..
>>>> The second bug, is that for the halt case at least,
>>>> nobody waits for the other CPU to actually halt
>>>> before continuing.. so we sometimes enter the shutdown
>>>> code while other CPUs are still active.
>>>>
>>>> This causes some machines to hang at shutdown,
>>>> unless CPU_HOTPLUG is configured and takes them offline
>>>> before we get here.
>>> I'm guessing there's a reason it doesn't pass '1' as the last argument,
>>> because that would fix that issue?
>> ..
>>
>> Undoubtedly -- perhaps the called CPU halts, and therefore cannot reply. :)
>
> Uhm yes, I guess stop_this_cpu() does exactly what the name implies :-)
>
>> But some kind of pre-halt ack, perhaps plus a short delay by the caller
>> after receipt of the ack, would probably suffice to kill that bug.
>>
>> But I really haven't studied this code enough to know,
>> other than that it historically has been a sticky area
>> to poke around in.
>
> Something like this will close the window to right up until the point
> where the other CPUs have 'almost' called halt().
>
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 5398385..94ec9bf 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -155,8 +155,9 @@ static void stop_this_cpu(void *dummy)
>  	/*
>  	 * Remove this CPU:
>  	 */
> -	cpu_clear(smp_processor_id(), cpu_online_map);
>  	disable_local_APIC();
> +	cpu_clear(smp_processor_id(), cpu_online_map);
> +	smp_wmb();
>  	if (hlt_works(smp_processor_id()))
>  		for (;;) halt();
>  	for (;;);
> @@ -175,6 +176,12 @@ static void native_smp_send_stop(void)
>  	local_irq_save(flags);
>  	smp_call_function(stop_this_cpu, NULL, 0, 0);
> +
> +	while (cpus_weight(cpu_online_map) > 1) {
> +		cpu_relax();
> +		smp_rmb();
> +	}
> +
>  	disable_local_APIC();
>  	local_irq_restore(flags);
>  }
..

Yup, that looks like it oughta work consistently.
Now we just need to hear from some of the folks who
have danced around this code in the past.

(added Pavel & Rafael to Cc:).
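
For anyone following along, here is a rough user-space sketch of the
ack-then-poll idea from the patch above, using pthreads and C11 atomics
in place of real CPUs and IPIs. Every name in it (worker,
stop_others_and_wait, online_count, stop_requested, MAX_WORKERS) is made
up for illustration and is not a kernel API; the atomic counter only
stands in for cpus_weight(cpu_online_map). It is an analogy, not a
proposal for smp.c.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_WORKERS 16	/* hypothetical limit for this sketch */

/* Hypothetical stand-ins, not kernel symbols. */
static atomic_int online_count;		/* ~ cpus_weight(cpu_online_map) */
static atomic_bool stop_requested;	/* ~ the stop IPI */

static void *worker(void *unused)
{
	(void)unused;

	/* Idle until the "stop IPI" arrives. */
	while (!atomic_load_explicit(&stop_requested, memory_order_acquire))
		sched_yield();

	/*
	 * Pre-halt ack: mark ourselves offline as the very last step,
	 * because a halted CPU cannot reply afterwards.
	 */
	atomic_fetch_sub_explicit(&online_count, 1, memory_order_release);

	/* A real CPU would loop on halt() here; a thread just returns. */
	return NULL;
}

/*
 * Start nworkers threads, ask them all to stop, and wait until only the
 * caller is still marked online. Returns the remaining online count
 * (1 on success) or -1 on bad input or thread-creation failure.
 */
int stop_others_and_wait(int nworkers)
{
	pthread_t tids[MAX_WORKERS];
	int i, started = 0;

	if (nworkers < 0 || nworkers > MAX_WORKERS)
		return -1;

	atomic_store(&stop_requested, false);
	atomic_store(&online_count, 1);		/* the caller itself */

	for (i = 0; i < nworkers; i++) {
		if (pthread_create(&tids[i], NULL, worker, NULL) != 0)
			break;
		/* Safe: workers cannot ack before stop_requested is set. */
		atomic_fetch_add(&online_count, 1);
		started++;
	}

	/* Fire the request without waiting for a reply ... */
	atomic_store_explicit(&stop_requested, true, memory_order_release);

	/* ... then poll until every other participant has acked. */
	while (atomic_load_explicit(&online_count, memory_order_acquire) > 1)
		sched_yield();

	assert(atomic_load(&online_count) == 1);

	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	return started == nworkers ? atomic_load(&online_count) : -1;
}
```

The same caveat as in the patch applies: the ack only says a worker is
about to stop, not that it has stopped, so a small window remains
between the ack and the actual halt.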