Re: [PATCH RFC] reboot: hotplug cpus in migrate_to_reboot_cpu()

From: Hsin-Yi Wang
Date: Thu Oct 03 2019 - 00:50:57 EST


On Wed, Oct 2, 2019 at 7:41 PM Hsin-Yi Wang <hsinyi@xxxxxxxxxxxx> wrote:
>
> Currently, system reboot uses arch-specific code (e.g. smp_send_stop()) to
> take the non-reboot CPUs offline. Some architectures, such as arm64, arm,
> and x86, only mark those CPUs offline in the cpu masks without really
> offlining them. This causes race conditions and occasional kernel warnings
> when the system reboots. Do CPU hotplug in migrate_to_reboot_cpu() to
> avoid this issue.
>
> Signed-off-by: Hsin-Yi Wang <hsinyi@xxxxxxxxxxxx>
> ---
> kernel warnings at reboot:
> [1] https://lore.kernel.org/lkml/20190820100843.3028-1-hsinyi@xxxxxxxxxxxx/
> [2] https://lore.kernel.org/lkml/20190727164450.GA11726@xxxxxxxxxxxx/
> ---
>  kernel/cpu.c    | 35 +++++++++++++++++++++++++++++++++++
>  kernel/reboot.c | 18 ------------------
>  2 files changed, 35 insertions(+), 18 deletions(-)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index fc28e17940e0..2f4d51fe91e3 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -31,6 +31,7 @@
> #include <linux/relay.h>
> #include <linux/slab.h>
> #include <linux/percpu-rwsem.h>
> +#include <linux/reboot.h>
>
> #include <trace/events/power.h>
> #define CREATE_TRACE_POINTS
> @@ -1366,6 +1367,40 @@ int __boot_cpu_id;
>
> #endif /* CONFIG_SMP */
>
> +void migrate_to_reboot_cpu(void)
> +{
> +	/* The boot cpu is always logical cpu 0 */
> +	int cpu = reboot_cpu;
> +
> +	/* Make certain the cpu I'm about to reboot on is online */
> +	if (!cpu_online(cpu))
> +		cpu = cpumask_first(cpu_online_mask);
> +
> +	/* Prevent races with other tasks migrating this task */
> +	current->flags |= PF_NO_SETAFFINITY;
> +
> +	/* Make certain I only run on the appropriate processor */
> +	set_cpus_allowed_ptr(current, cpumask_of(cpu));
> +
> +	/* Hotplug other cpus if possible */
> +	if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {

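This should use #ifdef CONFIG_HOTPLUG_CPU instead of IS_ENABLED() here,
since the block is still compiled when the option is disabled. I will fix
it in the next version if this patch looks reasonable.
(Reported-by: kbuild test robot <lkp@xxxxxxxxx>)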
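To illustrate the difference: code inside `if (IS_ENABLED(...))` is still
parsed and type-checked when the option is off, so it cannot reference
symbols that only exist under that option, whereas an #ifdef block is
dropped by the preprocessor. Below is a minimal userspace sketch of the
guarded loop, not kernel code. fake_cpu_down(), cpu_online_map,
offline_secondary_cpus() and count_online() are hypothetical stand-ins, and
CONFIG_HOTPLUG_CPU is defined by hand here rather than by Kconfig.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4
/* Assumption: defined by hand; remove this line to model a non-hotplug build. */
#define CONFIG_HOTPLUG_CPU 1

/* Hypothetical stand-in for cpu_online_mask: all CPUs start online. */
static bool cpu_online_map[NR_CPUS] = { true, true, true, true };

#ifdef CONFIG_HOTPLUG_CPU
/* Hypothetical stand-in for _cpu_down(); only exists in hotplug builds. */
static int fake_cpu_down(int cpu)
{
	cpu_online_map[cpu] = false;
	return 0;
}
#endif

/* Offline every online CPU except reboot_cpu; return the number of failures. */
static int offline_secondary_cpus(int reboot_cpu)
{
	int failed = 0;

	assert(reboot_cpu >= 0 && reboot_cpu < NR_CPUS);

#ifdef CONFIG_HOTPLUG_CPU
	/* This whole block vanishes when hotplug support is not configured. */
	for (int i = 0; i < NR_CPUS; i++) {
		if (i == reboot_cpu || !cpu_online_map[i])
			continue;
		if (fake_cpu_down(i)) {
			printf("Failed to offline cpu %d\n", i);
			failed++;
		}
	}
#endif

	return failed;
}

/* Count the CPUs still marked online. */
static int count_online(void)
{
	int n = 0;

	for (int i = 0; i < NR_CPUS; i++)
		if (cpu_online_map[i])
			n++;
	return n;
}
```

Calling offline_secondary_cpus(2) in this sketch leaves only CPU 2 online.
Without the CONFIG_HOTPLUG_CPU define, the function still builds and
simply leaves every CPU online, which is the behaviour the #ifdef is meant
to give a non-hotplug build.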
> +		int i, err;
> +
> +		cpu_maps_update_begin();
> +
> +		for_each_online_cpu(i) {
> +			if (i == cpu)
> +				continue;
> +			err = _cpu_down(i, 0, CPUHP_OFFLINE);
> +			if (err)
> +				pr_info("Failed to offline cpu %d\n", i);
> +		}
> +		cpu_hotplug_disabled++;
> +
> +		cpu_maps_update_done();
> +	}
> +}
> +
> /* Boot processor state steps */
> static struct cpuhp_step cpuhp_hp_states[] = {
> [CPUHP_OFFLINE] = {
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index c4d472b7f1b4..f0046be34a60 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -215,24 +215,6 @@ void do_kernel_restart(char *cmd)
> atomic_notifier_call_chain(&restart_handler_list, reboot_mode, cmd);
> }
>
> -void migrate_to_reboot_cpu(void)
> -{
> -	/* The boot cpu is always logical cpu 0 */
> -	int cpu = reboot_cpu;
> -
> -	cpu_hotplug_disable();
> -
> -	/* Make certain the cpu I'm about to reboot on is online */
> -	if (!cpu_online(cpu))
> -		cpu = cpumask_first(cpu_online_mask);
> -
> -	/* Prevent races with other tasks migrating this task */
> -	current->flags |= PF_NO_SETAFFINITY;
> -
> -	/* Make certain I only run on the appropriate processor */
> -	set_cpus_allowed_ptr(current, cpumask_of(cpu));
> -}
> -
> /**
> * kernel_restart - reboot the system
> * @cmd: pointer to buffer containing command to execute for restart
> --
> 2.23.0.444.g18eeb5a265-goog
>