Re: [PATCH 32/35] clockevents: Fix cpu down race for hrtimer based broadcasting

From: Nicolas Pitre
Date: Mon Feb 23 2015 - 11:33:05 EST


On Mon, 23 Feb 2015, Peter Zijlstra wrote:

> In any case, having had a second look I think I might have some ideas:
>
> - bL_switcher_enable() -- enables the whole switcher thing and
> disables half the cpus with hot-un-plug, creates a mapping etc.
>
> - bL_switcher_disable() -- disables the whole switcher thing and
> gives us back all our cpus with hot-plug.
>
> When the switcher is enabled; we switch by this magic cpu_suspend() call
> that saves the entire cpu state and allows you to restore it on another
> cpu.
>
> You muck about with the tick; you disable it before cpu_suspend() and
> re-enable it after on the target cpu. You further reprogram the
> interrupt routing from the old to the new cpu.
>
> But that appears to be it, no more.

Exactly.
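
For readers following along, the sequence quoted above can be modelled
roughly as below. This is only a toy sketch of the ordering, not the real
bL_switcher code: switch_cpu(), the state dictionary and the log entries are
all hypothetical names made up for illustration. The point at which interrupt
routing is reprogrammed relative to the other steps is an assumption, since
the description above does not say.

```python
# Toy model of one switch, as described in the quoted text.
# Every name here is a hypothetical stand-in, not a kernel API.

def switch_cpu(state, old_cpu, new_cpu):
    """Move the saved execution context from old_cpu to new_cpu."""
    log = []

    # Disable the per-cpu tick on the outgoing cpu before suspending it.
    state["tick_on"].discard(old_cpu)
    log.append(("tick_off", old_cpu))

    # Stand-in for cpu_suspend(): save the entire cpu state.
    saved = state["context"].pop(old_cpu)
    log.append(("suspend", old_cpu))

    # Restore that state on the other cpu.
    state["context"][new_cpu] = saved
    log.append(("resume", new_cpu))

    # Re-enable the tick, now on the target cpu.
    state["tick_on"].add(new_cpu)
    log.append(("tick_on", new_cpu))

    # Reprogram interrupt routing from the old cpu to the new one.
    # (Placement of this step in the sequence is an assumption.)
    state["irq_route"] = {
        irq: (new_cpu if cpu == old_cpu else cpu)
        for irq, cpu in state["irq_route"].items()
    }
    log.append(("irq_migrate", old_cpu, new_cpu))

    return log


if __name__ == "__main__":
    state = {
        "tick_on": {0},
        "context": {0: "saved-registers"},
        "irq_route": {30: 0, 31: 2},
    }
    for step in switch_cpu(state, old_cpu=0, new_cpu=1):
        print(step)
```

The model only captures what moves between the two cpus: the tick, the saved
context and the interrupt targets. It deliberately omits everything else.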

> I suppose the tick is special because it's the only per-cpu device?

Right.

> The reported function that fails: bL_switcher_restore_cpus() is called
> in the error paths of the former and the main path in the latter to make
> the 'stolen' cpus re-appear.
>
> The patch in question somehow makes that go boom.
>
>
> Now what all do you need to do to make it go boom? Just enable/disable
> the switcher once and it'll explode? Or does it need to do actual
> switches while it is enabled?

It gets automatically enabled during boot. Then several switches are
performed while user space is brought up. If I manually disable it
via /sys then it goes boom.

> The place where it explodes is a bit surprising; it thinks hrtimers are
> not enabled even though it's calling into hrtimer code on that cpu...