Re: [PATCH] cpu/hotplug: Set st->cpu earlier

From: Vincent Donnefort
Date: Mon Mar 07 2022 - 10:41:54 EST


On 25/02/2022 13:49, Steven Price wrote:
Setting the 'cpu' member of struct cpuhp_cpu_state in cpuhp_create() is
too late as other callbacks can be made before that point. In particular
if one of the earlier callbacks fails and triggers a rollback that
rollback will be done with st->cpu==0 causing CPU0 to be erroneously set

st->cpu is needed even before any cpuhp_step callback has run: cpuhp_set_state() in _cpu_up() already uses it. So although CPUHP_CREATE_THREADS is the first step, setting st->cpu in cpuhp_create() is indeed not early enough.

to be dying, causing the scheduler to get mightily confused and throw
its toys out of the pram.

Move the assignment earlier before any callbacks have a chance to run.
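
The failure mode described above can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct hp_state, dying[], rollback() and bring_up_with_failure() are hypothetical stand-ins for struct cpuhp_cpu_state, the cpu_dying mask and the rollback path, and the model assumes the per-CPU state starts out zero-initialised.

```c
#include <stdbool.h>
#include <string.h>

#define NR_MODEL_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical stand-in for struct cpuhp_cpu_state: only the 'cpu' member. */
struct hp_state {
	unsigned int cpu;
};

/* Zero-initialised, so every entry starts with .cpu == 0. */
static struct hp_state hp_states[NR_MODEL_CPUS];

/* Hypothetical stand-in for the cpu_dying mask. */
static bool dying[NR_MODEL_CPUS];

static void reset_model(void)
{
	memset(hp_states, 0, sizeof(hp_states));
	memset(dying, 0, sizeof(dying));
}

/* Models a rollback: whichever CPU st->cpu names gets marked as dying. */
static void rollback(struct hp_state *st)
{
	dying[st->cpu] = true;
}

/*
 * Models bringing up 'cpu' when an early callback fails.
 * set_cpu_early == true  -> st->cpu is assigned before any callback runs.
 * set_cpu_early == false -> st->cpu is still 0 when the rollback happens.
 */
static int bring_up_with_failure(unsigned int cpu, bool set_cpu_early)
{
	struct hp_state *st = &hp_states[cpu];

	if (set_cpu_early)
		st->cpu = cpu;

	/* Pretend an early callback failed (e.g. an allocation failure). */
	rollback(st);
	return -1;
}
```

In this model, bring_up_with_failure(1, false) marks dying[0] instead of dying[1], which mirrors CPU0 being erroneously set as dying; bring_up_with_failure(1, true) marks only dying[1].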

Probably needs a

Fixes: 2ea46c6fc945 ("cpumask/hotplug: Fix cpu_dying() state tracking")


Signed-off-by: Steven Price <steven.price@xxxxxxx>
CC: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
---
This was initially triggered by a VM which didn't have enough memory for
its VCPUs, but an easier way of triggering it is to make a change like
below in __smpboot_create_thread (as suggested by Dietmar Eggemann) to
pretend the memory allocation fails for a particular CPU:

 	td = kzalloc_node(sizeof(*td), GFP_KERNEL, cpu_to_node(cpu));
-	if (!td)
+	if (!td || cpu == 1)
 		return -ENOMEM;

I'm not entirely sure quite where the best place to set st->cpu is, so
please do let me know if there's a better place to do the assignment.
---
kernel/cpu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 407a2568f35e..49c3ef6067e5 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -720,7 +720,6 @@ static void cpuhp_create(unsigned int cpu)
 	init_completion(&st->done_up);
 	init_completion(&st->done_down);
-	st->cpu = cpu;
 }
 static int cpuhp_should_run(unsigned int cpu)
@@ -1333,6 +1332,8 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
 		goto out;
 	}
+	st->cpu = cpu;
+

Could eventually go just before cpuhp_set_state() in the same function, as this seems to be the first user of st->cpu.

/*
* The caller of cpu_up() might have raced with another
* caller. Nothing to do.