Re: get_online_cpus() && workqueues

From: Heiko Carstens
Date: Sun Apr 27 2008 - 08:22:41 EST


On Sat, Apr 26, 2008 at 06:43:30PM +0400, Oleg Nesterov wrote:
> Gautham, Srivatsa, seriously, can't we uglify cpu.c a little bit to solve
> the problem. Please see the illustration patch below. It looks complicated,
> but in fact it is quite trivial.
>
> In short: work_struct can't use get_online_cpus() due to deadlock with the
> CPU_DEAD phase.
>
> Can't we add another nested lock which is dropped right after __cpu_die()?
> (in fact I think it could be dropped after __stop_machine_run).
>
> The new read-lock is get_online_map() (just a random name for now). The only
> difference wrt get_online_cpus() is that it doesn't protect against CPU_DEAD,
> but most users of get_online_cpus() don't need this, they only need a
> stable cpu_online_map and sometimes they need to be sure that some per-cpu
> object (say, cpu_workqueue_struct->thread) can't be destroyed under this
> lock.
>
> get_online_map() seems to fit for this, and can be used from work->func().
> (actually, I think most users of get_online_cpus() could use the new
> helper instead, but this doesn't matter).
>
> Heiko, what do you think? Is it suitable for arch_reinit_sched_domains()?

Uhm, no. For arch_reinit_sched_domains() that would allow concurrent
callers of arch_init_sched_domains(), since sched.c calls that function in
quite a lot of the CPU_* phases (including CPU_DEAD) in update_sched_domains().
Not sure why it does that, however.

But on the other hand there can already be concurrent callers via
sched_power_savings_store().

And with s390 calling arch_reinit_sched_domains() from outside there can be
yet another concurrent caller. Looks like the locking is broken anyway.
Sigh.

Looks like we need a new lock in arch_reinit_sched_domains() to prevent
concurrent callers to arch_init_sched_domains().
The calls from update_sched_domains() are implicitly prevented by the
cpu hotplug lock _and_ the fact that arch_reinit_sched_domains does
the get/put_online_cpus thing.
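
To make the "new lock" idea concrete, here is a rough userspace sketch
using pthreads. It is only an analogy, not kernel code and not the patch I
intend to send: sched_domains_lock, rebuild_domains_stub(),
reinit_domains_locked() and run_concurrent_callers() are all made-up names,
and the stub merely stands in for arch_init_sched_domains(). It records how
many callers are inside the rebuild at the same time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical lock; the name is an assumption, not from any posted patch. */
static pthread_mutex_t sched_domains_lock = PTHREAD_MUTEX_INITIALIZER;

static atomic_int callers_inside;	/* callers currently in the rebuild */
static atomic_int max_inside;		/* highest overlap ever observed */

/* Stand-in for arch_init_sched_domains(): tracks how many callers overlap. */
static void rebuild_domains_stub(void)
{
	int now = atomic_fetch_add(&callers_inside, 1) + 1;
	int seen = atomic_load(&max_inside);

	/* Raise max_inside to 'now' unless another thread stored more. */
	while (now > seen &&
	       !atomic_compare_exchange_weak(&max_inside, &seen, now))
		;
	atomic_fetch_sub(&callers_inside, 1);
}

/* Stand-in for arch_reinit_sched_domains() with the extra lock added. */
static void reinit_domains_locked(void)
{
	pthread_mutex_lock(&sched_domains_lock);
	rebuild_domains_stub();
	pthread_mutex_unlock(&sched_domains_lock);
}

static void *caller_thread(void *arg)
{
	int iterations = *(int *)arg;

	for (int i = 0; i < iterations; i++)
		reinit_domains_locked();
	return NULL;
}

/* Runs nthreads concurrent callers; returns the peak overlap, -1 on error. */
static int run_concurrent_callers(int nthreads, int iterations)
{
	pthread_t tids[16];
	int started = 0;

	if (nthreads < 1 || nthreads > 16)
		return -1;
	atomic_store(&callers_inside, 0);
	atomic_store(&max_inside, 0);
	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, caller_thread, &iterations))
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	return started == nthreads ? atomic_load(&max_inside) : -1;
}
```

With the lock taken around the stub, run_concurrent_callers(4, 1000) should
report a peak overlap of 1. Dropping the lock/unlock pair lets the callers
overlap, which is the situation the three entry points above can get into
today.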

So the conclusion is: the new get_online_map() wouldn't solve the deadlock
here, but we have a bug anyway :) Will see if I can come up with a
tested patch tomorrow.

For the "don't call get_online_cpus() from within a work_struct" I have
the patch below. Even though I think it sucks. But at least it should
work.

arch/s390/kernel/topology.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff -urpN linux-2.6/arch/s390/kernel/topology.c linux-2.6-patched/arch/s390/kernel/topology.c
--- linux-2.6/arch/s390/kernel/topology.c 2008-04-25 14:16:25.000000000 +0200
+++ linux-2.6-patched/arch/s390/kernel/topology.c 2008-04-25 14:16:25.000000000 +0200
@@ -9,6 +9,7 @@
 #include <linux/device.h>
 #include <linux/bootmem.h>
 #include <linux/sched.h>
+#include <linux/kthread.h>
 #include <linux/workqueue.h>
 #include <linux/cpu.h>
 #include <linux/smp.h>
@@ -229,9 +230,20 @@ void arch_update_cpu_topology(void)
 	}
 }
 
-static void topology_work_fn(struct work_struct *work)
+static int topology_kthread(void *data)
 {
 	arch_reinit_sched_domains();
+	return 0;
+}
+
+static void topology_work_fn(struct work_struct *work)
+{
+	/* We can't call arch_reinit_sched_domains() from a multi-threaded
+	 * workqueue context since it may deadlock in case of cpu hotplug.
+	 * So we have to create a kernel thread in order to call
+	 * arch_reinit_sched_domains().
+	 */
+	kthread_run(topology_kthread, NULL, "topology_update");
 }
 
 void topology_schedule_update(void)