Re: [PATCH 2/3] work_on_cpu: Use our own workqueue.

From: Ingo Molnar
Date: Mon Jan 26 2009 - 12:16:53 EST



* Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> > > Yet another kernel thread for each CPU. All because of some dung
> > > way down in arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c.
> > >
> > > Is there no other way?
> >
> > Perhaps, but this works. Trying to be clever got me into this mess in
> > the first place.
> >
> > We could stop using workqueues and change work_on_cpu to create a
> > thread every time, which would give it a new failure mode so I don't
> > know that everyone could use it any more. Or we could keep a single
> > thread around to do all the cpus, and duplicate much of the workqueue
> > code.
> >
> > None of these options are appealing...
>
> Can we try harder please? 10 screenfuls of kernel threads in the ps
> output is just irritating.
>
> How about banning the use of work_on_cpu() from schedule_work() handlers
> and then fixing that driver somehow?

Yes, but that's fundamentally fragile: anyone who happens to stick the
wrong thing into keventd (and it's dead easy because schedule_work() is
easy to use) will lock up work_on_cpu() users.

work_on_cpu() is an important and low-level enough facility that it
should be isolated from casual interaction like that.

> What _is_ the bug anyway? The only description we were given was
>
> Impact: remove potential clashes with generic kevent workqueue
>
> Annoyingly, some places we want to use work_on_cpu are already in
> workqueues. As per Ingo's suggestion, we create a different
> workqueue for work_on_cpu.
>
> which didn't bother telling anyone squat.
>
> When was this bug added? Was it added into that driver or was it due to
> infrastructural changes?

This fixes lockups during bootup caused by the cpumask changes/cleanups,
which replaced the set_cpus_allowed() + on-kernel-stack cpumask_t pattern
with work_on_cpu().

Which was fine, except it didn't take into account the interaction with
the kevents workqueue and the very wide cross section for worklet
dependencies that this brings with it. work_on_cpu() was rarely used
before, so this didn't show up.
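
To make the failure mode concrete, here is a toy user-space model of it.
This is only a sketch under stated assumptions, not the kernel's
workqueue code: struct toy_queue, toy_work_on_cpu(), toy_read_int() and
toy_nested_call_completes() are hypothetical names invented for this
illustration, and each queue is assumed to have exactly one worker. A
real caller would block forever where the model returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy queue served by exactly one worker (hypothetical, not a kernel type). */
struct toy_queue {
	bool worker_busy;	/* true while the single worker runs a handler */
};

/*
 * Toy stand-in for a synchronous "queue fn on target and wait" call.
 * Returns false where a real caller would wait forever because the
 * only worker on target is already occupied.
 */
static bool toy_work_on_cpu(struct toy_queue *target,
			    int (*fn)(void *), void *arg, int *ret)
{
	assert(target != NULL && fn != NULL && ret != NULL);

	if (target->worker_busy)
		return false;	/* queued item can never start: lockup */

	target->worker_busy = true;
	*ret = fn(arg);
	target->worker_busy = false;
	return true;
}

/* Trivial payload: hand back the integer that arg points to. */
static int toy_read_int(void *arg)
{
	return *(int *)arg;
}

/*
 * Model a handler that is running on running_on and makes a nested
 * synchronous call onto target. Returns true if the nested call
 * completed with the expected result.
 */
static bool toy_nested_call_completes(struct toy_queue *running_on,
				      struct toy_queue *target)
{
	int value = 42, ret = 0;
	bool ok;

	running_on->worker_busy = true;		/* we occupy this worker */
	ok = toy_work_on_cpu(target, toy_read_int, &value, &ret);
	running_on->worker_busy = false;

	return ok && ret == value;
}
```

With two idle queues named shared and dedicated,
toy_nested_call_completes(&shared, &shared) returns false (the nested
item sits behind the handler that is waiting for it), while
toy_nested_call_completes(&shared, &dedicated) returns true. That is the
property a separate workqueue for work_on_cpu() is meant to provide.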

Ingo