Re: [RFC][PATCH 0/5] perf/tracing/cpuhotplug: Fix locking order

From: Steven Rostedt
Date: Fri May 12 2017 - 17:34:57 EST


On Fri, 12 May 2017 21:49:56 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Fri, May 12, 2017 at 01:15:44PM -0400, Steven Rostedt wrote:
> > 2) Allow for get_online_cpus() to nest
>
> So Thomas and me have been avoiding doing this.
>
> In general we avoid nested locking in the kernel. Nested locking makes
> an absolute mockery of locking rules and what all gets protected.
>
> Yes, its much easier.. But we managed to kill the BKL, so surely we can
> fix the hotplug lock too, right ;-)

Actually, talking this over with tglx. We may be able to simplify this
a lot without the goc nesting, by removing the requirement of taking
get_online_cpus() before text_mutex. Reading the comment in kprobes.c:

/*
* The optimization/unoptimization refers online_cpus via
* stop_machine() and cpu-hotplug modifies online_cpus.
* And same time, text_mutex will be held in cpu-hotplug and here.
* This combination can cause a deadlock (cpu-hotplug try to lock
* text_mutex but stop_machine can not be done because online_cpus
* has been changed)
* To avoid this deadlock, we need to call get_online_cpus()
* for preventing cpu-hotplug outside of text_mutex locking.
*/
get_online_cpus();
mutex_lock(&text_mutex);

Back when this was added, when we offlined all but one CPU, we would
call into alternatives to switch the system into "uniprocessor" mode.
Turning spinlocks into nops and such. And when we onlined a CPU again,
it would do the same thing in reverse (converting those nops back to
spinlocks). Paul showed Rusty that it was causing huge overheads in
onlining and offlining CPUs and it seemed a bit ridiculous to do so
(although I have to say it was a cool feature). And finally we stop
doing that.

This means that text_mutex, which was taken by the alternative code, no
longer is taken in cpu hotplug code. That means there's no longer a
deadlock scenario, as we don't have anyplace(*) that grabs
get_online_cpus() and takes the text_mutex. Removing that will
simplify things tremendously!

(*) with one exception: perf.

Currently perf does a get_online_cpu() at a high level. Will it be
possible to move that down, such that we don't have it taken when we do
any software events?

-- Steve