Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion

From: Paul E. McKenney
Date: Sun Jul 06 2025 - 13:01:23 EST


On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> A recent revision of RCU's CPU online checks sparked some discussion
> around how IPIs synchronize with hotplug.
>
> Add comments explaining how preemption disable creates mutual exclusion with
> CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> cannot proceed while any CPU (including the IPI sender) has preemption
> disabled.
>
> Cc: Andrea Righi <arighi@xxxxxxxxxx>
> Cc: Paul E. McKenney <paulmck@xxxxxxxxxx>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: rcu@xxxxxxxxxxxxxxx
> Co-developed-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Signed-off-by: Joel Fernandes <joelagnelf@xxxxxxxxxx>

Acked-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
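
For anyone following along, the sender-side pattern that the comments
below document boils down to something like this sketch. It is not
code from the patch: send_to_cpu() and my_ipi_func() are made-up names
for illustration, and smp_call_function_single() already does the
get_cpu()/put_cpu() dance internally, so the outer pair here only
makes the exclusion window explicit.

#include <linux/errno.h>
#include <linux/smp.h>

static void my_ipi_func(void *info)
{
        /* Runs on the target CPU, out of the IPI. */
}

static int send_to_cpu(int target, void *info)
{
        int ret = -ENXIO;

        /*
         * get_cpu() disables preemption, so stop_machine(), and with it
         * take_cpu_down(), cannot begin until the matching put_cpu().
         * The cpu_online() check below therefore stays valid across the
         * IPI send: the target cannot be offlined underneath us.
         */
        get_cpu();
        if (cpu_online(target))
                ret = smp_call_function_single(target, my_ipi_func, info, 1);
        put_cpu();

        return ret;
}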

> ---
> v1->v2: Reworded a bit more (minor nit).
>
> kernel/cpu.c | 4 ++++
> kernel/smp.c | 12 ++++++++++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index a59e009e0be4..a8ce1395dd2c 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1310,6 +1310,10 @@ static int takedown_cpu(unsigned int cpu)
>
> /*
> * So now all preempt/rcu users must observe !cpu_active().
> + *
> + * stop_machine() waits for all CPUs to enable preemption. This lets
> + * take_cpu_down() atomically update the CPU masks and flush the last
> + * IPI before any new IPIs can be sent.
> */
> err = stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu));
> if (err) {
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 974f3a3962e8..842691467f9e 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -93,6 +93,9 @@ int smpcfd_dying_cpu(unsigned int cpu)
> * explicitly (without waiting for the IPIs to arrive), to
> * ensure that the outgoing CPU doesn't go offline with work
> * still pending.
> + *
> + * This runs in stop_machine()'s atomic context with interrupts
> + * disabled, so CPU offlining and IPI flushing happen together atomically.
> */
> __flush_smp_call_function_queue(false);
> irq_work_run();
> @@ -418,6 +421,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
> */
> static int generic_exec_single(int cpu, call_single_data_t *csd)
> {
> + /*
> + * Preemption must be disabled by the caller to mutually exclude with
> + * stop_machine() atomically updating CPU masks and flushing IPIs.
> + */
> if (cpu == smp_processor_id()) {
> smp_call_func_t func = csd->func;
> void *info = csd->info;
> @@ -640,6 +647,11 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
> /*
> * prevent preemption and reschedule on another processor,
> * as well as CPU removal
> + *
> + * get_cpu() disables preemption, ensuring mutual exclusion with
> + * stop_machine(), where CPU offlining and the last IPI flush happen
> + * atomically with respect to this code. This guarantees that the
> + * cpu_online() check and the IPI send cannot lose IPIs to offlining.
> */
> this_cpu = get_cpu();
>
> --
> 2.43.0
>
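
And for completeness, the hotplug side of the handshake, condensed from
take_cpu_down() and smpcfd_dying_cpu() (a simplified sketch, not the
literal kernel code; note that __flush_smp_call_function_queue() is
static to kernel/smp.c):

static int take_cpu_down_sketch(void *unused)
{
        /*
         * Runs as the stop_machine() callback: every online CPU is
         * spinning in its stopper thread with interrupts disabled, so
         * no CPU can be inside a preempt-disabled IPI-send section.
         */
        set_cpu_online(smp_processor_id(), false);

        /*
         * In the same interrupts-off window, run any IPI callbacks that
         * were queued before the mask was cleared, so this CPU does not
         * go offline with work still pending.
         */
        __flush_smp_call_function_queue(false);

        return 0;
}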