Re: [PATCH] Do not force shutdown/reboot to boot cpu.

From: Robin Holt
Date: Wed Apr 10 2013 - 10:01:49 EST


On Wed, Apr 10, 2013 at 01:16:20PM +0200, Ingo Molnar wrote:
>
> * Robin Holt <holt@xxxxxxx> wrote:
>
> > On Mon, Apr 08, 2013 at 09:11:06AM -0700, H. Peter Anvin wrote:
> > > On 04/08/2013 08:57 AM, Ingo Molnar wrote:
> > > >
> > > > I think the original commit:
> > > >
> > > > f96972f2dc63 kernel/sys.c: call disable_nonboot_cpus() in kernel_restart()
> > > >
> > > > actually regressed your 1024 CPU systems, and should possibly be reverted or fixed
> > > > in some other fashion - such as by migrating to the primary CPU (on architectures
> > > > that require that), instead of hotplug offlining every secondary CPU on every
> > > > architecture!
> > > >
> > > > Alternatively, disable_nonboot_cpus() could perhaps be improved to down CPUs in
> > > > parallel: issue the CPU-down requests to every CPU, then wait for them to complete
> > > > - instead of the loop over every CPU?
> > > >
> > > > This would be the conceptual counterpart to parallel boot up of CPUs - something
> > > > SGI might be interested in as well?
> > > >
> > >
> > > Migrating to the boot processor and then calling stop_machine() to
> > > defang any other processors should be sufficient, no?
> > >
> > > I don't know if there is any reason to deschedule all tasks?
> >
> > My reading of the original commit indicated that some architectures'
> > firmware needs the boot cpu to be the one initiating reboot.
> >
> > If that is correct, then I can not see why a stop_machine() implementation
> > will not work.
> >
> > Since this is in generic kernel code, how can I proceed?
>
> I think rebooting on the same CPU where we booted up is something worth having in
> general, as a firmware robustness feature. (assuming the CPU in question is still
> online)
>
> We have similar constraints in the suspend code for example - some x86 firmware
> breaks if suspend related ACPI calls are not done on the boot CPU ...
>
> So how about restoring the old "just reboot, don't shut down the others" behavior,
> extended with a "reboot on the CPU that booted up" reboot affinity logic?

Just want to be sure I am going in the right direction: in the shutdown and
reboot case, you would support something like the following?

diff --git a/kernel/sys.c b/kernel/sys.c
index 39c9c4a..35845c5 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -358,6 +358,18 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);

+void migrate_to_boot_cpu(void)
+{
+	cpumask_t *shutdown_cpu_mask;
+
+	shutdown_cpu_mask = kzalloc(sizeof(cpumask_t), GFP_KERNEL);
+	if (shutdown_cpu_mask) {
+		cpumask_set_cpu(0, shutdown_cpu_mask);
+		cpumask_and(shutdown_cpu_mask, shutdown_cpu_mask, cpu_online_mask);
+		set_cpus_allowed_ptr(current, shutdown_cpu_mask);
+	}
+}
+
 /**
  * kernel_restart - reboot the system
  * @cmd: pointer to buffer containing command to execute for restart
@@ -369,7 +381,7 @@ EXPORT_SYMBOL(unregister_reboot_notifier);
 void kernel_restart(char *cmd)
 {
 	kernel_restart_prepare(cmd);
-	disable_nonboot_cpus();
+	migrate_to_boot_cpu();
 	if (!cmd)
 		printk(KERN_EMERG "Restarting system.\n");
 	else
@@ -413,7 +425,7 @@ void kernel_power_off(void)
 	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
-	disable_nonboot_cpus();
+	migrate_to_boot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "Power down.\n");
 	kmsg_dump(KMSG_DUMP_POWEROFF);
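
To make the intended behaviour of the mask handling in migrate_to_boot_cpu()
easier to reason about, here is a rough user-space model of it. This is only
a sketch, not kernel code: cpu_bits, reboot_affinity() and would_migrate()
are hypothetical names made up for the illustration, the model is limited to
64 CPUs, and it assumes, as the patch does, that CPU 0 is the boot CPU. It
also assumes that set_cpus_allowed_ptr() rejects an empty mask and leaves the
task where it is, which is my understanding of the scheduler code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model: bit N set means CPU N is online. */
typedef uint64_t cpu_bits;

/*
 * Model of the mask built in migrate_to_boot_cpu(): start with only the
 * boot CPU set, then AND it with the online mask. An empty result means
 * the boot CPU is offline, so no migration would take place.
 */
static cpu_bits reboot_affinity(unsigned int boot_cpu, cpu_bits online)
{
	assert(boot_cpu < 64);	/* limit of this model, not of the kernel */
	return ((cpu_bits)1 << boot_cpu) & online;
}

/* Nonzero if the reboot path would actually move to the boot CPU. */
static int would_migrate(unsigned int boot_cpu, cpu_bits online)
{
	return reboot_affinity(boot_cpu, online) != 0;
}
```

With CPUs 0-3 online, reboot_affinity(0, 0xF) yields a mask containing only
CPU 0. With CPU 0 offlined (online mask 0xE) the result is empty, which
corresponds to the "assuming the CPU in question is still online" case: the
reboot would simply continue on whatever CPU it is already running on.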
