Re: [PATCH v2] sched/core: fix illegal RCU from offline CPUs

From: Peter Zijlstra
Date: Tue Jan 21 2020 - 05:35:31 EST


On Mon, Jan 20, 2020 at 03:35:09PM -0500, Qian Cai wrote:
> > On Jan 20, 2020, at 5:17 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > Bah.. that's horrible. Surely we can find a better place to do this in
> > the whole hotplug machinery.
> >
> > Perhaps you can have takedown_cpu() do the mmdrop()?
>
> The problem is that not all arch_cpu_idle_dead() implementations call
> idle_task_exit(). For example, alpha and parisc do not, so it needs

How is that not broken? If the idle thread runs on an active_mm, we need
to drop that reference. This needs fixing regardless.
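
Roughly the pairing in question (illustrative sketch only, not the actual
context_switch() code; the two helpers below are made up for the example):

#include <linux/mm_types.h>		/* struct mm_struct, init_mm */
#include <linux/sched.h>		/* struct task_struct */
#include <linux/sched/mm.h>		/* mmgrab(), mmdrop() */

/*
 * A kernel thread (the idle task here) that borrows an mm as its
 * ->active_mm holds an mm_count reference taken with mmgrab();
 * somebody has to pair that with mmdrop() eventually.
 */
static void borrow_mm(struct task_struct *idle, struct mm_struct *mm)
{
	mmgrab(mm);			/* pin the mm_struct itself */
	idle->active_mm = mm;
}

static void unborrow_mm(struct task_struct *idle)
{
	struct mm_struct *mm = idle->active_mm;

	idle->active_mm = &init_mm;
	mmdrop(mm);			/* skip this and every offline leaks one ref */
}

If an arch parks the dying CPU without ever reaching the mmdrop() half of
that pairing, the mm_struct of whatever task last ran there stays pinned
forever.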

> to deal with some kind of ifdef dance in takedown_cpu() to
> conditionally call mmdrop(), which sounds even more horrible?
>
> If you really prefer it anyway, maybe something like touching every arch's __cpu_die() to also call mmdrop(), depending on the arch?
>
> BTW, how do we obtain the other CPU's current task's mm? Is that cpu_rq(cpu)->curr->active_mm?

Something like this; except you'll need to go audit archs to make sure
they all call idle_task_exit() and/or put in comments on why they don't
have to (perhaps their bringup switches them to &init_mm unconditionally
and the switch_mm() is not required).
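
For the audit, the shape to look for in each arch's offline path is roughly
the below (sketch only; the includes and the halt loop stand in for whatever
the arch actually does to park the CPU):

#include <linux/cpu.h>
#include <linux/sched/hotplug.h>	/* idle_task_exit() */
#include <asm/processor.h>		/* cpu_relax() */

void arch_cpu_idle_dead(void)
{
	/*
	 * Runs on the dying CPU, in its idle task, after the CPU has
	 * been marked offline. If idle_task_exit() (or an equivalent
	 * unconditional switch to &init_mm) is missing here, the idle
	 * task keeps running on a stale active_mm.
	 */
	idle_task_exit();

	/* arch-specific: park / power off the CPU; never returns */
	for (;;)
		cpu_relax();
}

Some arches route this through their own cpu_die()/play_dead() helpers
instead of overriding arch_cpu_idle_dead() directly; the point is that
idle_task_exit() (or the equivalent switch) happens somewhere on the way
down.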

---
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 9c706af713fb..2b4d8a69e8d9 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -564,6 +564,23 @@ static int bringup_cpu(unsigned int cpu)
return bringup_wait_for_ap(cpu);
}

+static int finish_cpu(unsigned int cpu)
+{
+	struct task_struct *idle = idle_thread_get(cpu);
+	struct mm_struct *mm = idle->active_mm;
+
+	/*
+	 * idle_task_exit() will have switched to &init_mm, now
+	 * clean up any remaining active_mm state.
+	 */
+	if (mm == &init_mm)
+		return 0;
+
+	idle->active_mm = &init_mm;
+	mmdrop(mm);
+	return 0;
+}
+
/*
* Hotplug state machine related functions
*/
@@ -1434,7 +1451,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
[CPUHP_BRINGUP_CPU] = {
.name = "cpu:bringup",
.startup.single = bringup_cpu,
- .teardown.single = NULL,
+ .teardown.single = finish_cpu,
.cant_stop = true,
},
/* Final state before CPU kills itself */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fc1dfc007604..8f049fb77a3d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6188,13 +6188,14 @@ void idle_task_exit(void)
struct mm_struct *mm = current->active_mm;

BUG_ON(cpu_online(smp_processor_id()));
+	BUG_ON(current != this_rq()->idle);

if (mm != &init_mm) {
switch_mm(mm, &init_mm, current);
-	current->active_mm = &init_mm;
finish_arch_post_lock_switch();
}
-	mmdrop(mm);
+
+	/* finish_cpu(), as run on the BP, will clean up the active_mm state */
}

/*
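
FWIW, once that's sorted, the path is easy to exercise from userspace by
bouncing a CPU through its sysfs online file. A minimal sketch (assumes
CPU1 exists and is hot-pluggable, and that this runs as root; writing "0"
runs the takedown path, including the new finish_cpu(), "1" brings the CPU
back):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int cpu1_set(const char *val)
{
	int fd = open("/sys/devices/system/cpu/cpu1/online", O_WRONLY);

	if (fd < 0 || write(fd, val, 1) != 1) {
		perror("cpu1/online");
		if (fd >= 0)
			close(fd);
		return -1;
	}
	return close(fd);
}

int main(void)
{
	if (cpu1_set("0"))	/* offline: teardown runs on the BP */
		return 1;
	sleep(1);
	return cpu1_set("1") ? 1 : 0;
}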