Re: [PATCH 2/2] cpuhotplug: introduce try_get_online_cpus()

From: Paul E. McKenney
Date: Wed Jun 03 2009 - 20:17:00 EST


On Mon, Jun 01, 2009 at 09:19:31AM -0700, Paul E. McKenney wrote:
> On Mon, Jun 01, 2009 at 05:01:50PM +0930, Rusty Russell wrote:
> > On Sat, 30 May 2009 06:01:18 am Andrew Morton wrote:
> > > I do think that we should look at
> > > alternative (non-trylocky) ways of fixing them.
> >
> > Speculating: we could add a "keep_cpu()" (FIXME: improve name) which is kind
> > of like get_cpu() only doesn't disable preemption and only stops *this* cpu
> > from going down.
> >
> > Not sure where that gets us, but if someone's going to dig deep into this it
> > might help.
>
> I have been beating up on the approach of disabling preemption to pin down
> a single CPU, and although it is working, it is no faster than simply
> doing get_online_cpus() and it is much much more subtle and complex.
> I am not sure that I have all the races properly accounted for, and I
> am failing to see the point of having something quite this ugly in the
> kernel when much simpler alternatives exist.
>
> The main vulnerability is the possibility that someone will invoke
> synchronize_rcu_expedited() while holding a mutex that is also acquired
> in a CPU-hotplug notifier, as Lai noted. But this is easily handled
> given a primitive that will say whether the current CPU is executing in a
> CPU-hotplug notifier. This primitive is permitted to sometimes mistakenly
> say that the current CPU is executing in a CPU-hotplug notifier when it
> is not (as long as it doesn't do so too often), but not vice versa.
>
> One way to implement this would be to have such a primitive simply say
> whether or not a CPU-hotplug operation is currently in effect. Yes, this
> is racy, but not when it matters -- you cannot possibly exit a CPU-hotplug
> operation while executing in a CPU-hotplug notifier. For example,
> the following exported from kernel/cpu.c would work just fine:
>
> bool cpu_hotplug_in_progress(void)
> {
> 	return cpu_hotplug.active_writer != NULL;
> }
>
> I believe that we should be OK moving forward with an updated version of
> http://lkml.org/lkml/2009/5/22/332 even without the deadlock avoidance.
> Having the deadlock avoidance would be better, of course, so I will use
> something like the above on the next patch.

Of course, the above does not actually solve the deadlock, instead
merely making it less likely to occur. I have absolutely no idea what
I was thinking!
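
To spell out the race, here is a minimal userspace sketch (a model only,
not kernel code). The struct and the model_* helpers are hypothetical
names made up for illustration. The model assumes that get_online_cpus()
blocks for as long as a hotplug operation is active, and that a notifier
needs the same mutex the caller already holds. The check passes, the
hotplug operation then begins, and each side ends up waiting on the other.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of the relevant state; NOT kernel code. */
struct hotplug_model {
	const void *active_writer;	/* stands in for cpu_hotplug.active_writer */
	bool mutex_held_by_caller;	/* caller holds a mutex the notifier also wants */
};

/* Same test as the proposed primitive, applied to the model state. */
static bool cpu_hotplug_in_progress(const struct hotplug_model *m)
{
	return m->active_writer != NULL;
}

/* Hypothetical helper: a hotplug operation starts on some other CPU. */
static void model_hotplug_begin(struct hotplug_model *m, const void *writer)
{
	m->active_writer = writer;
}

/* Assumption: the notifier blocks if the caller still holds the mutex. */
static bool model_notifier_blocks(const struct hotplug_model *m)
{
	return m->active_writer != NULL && m->mutex_held_by_caller;
}

/* Assumption: get_online_cpus() blocks while a hotplug operation is active. */
static bool model_get_online_cpus_blocks(const struct hotplug_model *m)
{
	return m->active_writer != NULL;
}

/* Replays one unlucky interleaving; returns true if it ends in deadlock. */
static bool model_racy_interleaving_deadlocks(void)
{
	static const int writer_task = 0;	/* placeholder identity for the writer */
	struct hotplug_model m = { NULL, false };
	bool looked_safe;

	m.mutex_held_by_caller = true;			/* caller acquires the mutex */
	looked_safe = !cpu_hotplug_in_progress(&m);	/* check says "no hotplug" */
	model_hotplug_begin(&m, &writer_task);		/* hotplug starts after the check */

	if (!looked_safe)
		return false;	/* caller would have taken the fallback path */

	/* Caller waits for hotplug to end; notifier waits for the mutex. */
	return model_get_online_cpus_blocks(&m) && model_notifier_blocks(&m);
}
```

The check narrows the window but cannot close it, since nothing prevents
a hotplug operation from starting right after the check returns.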

Back to try_get_online_cpus().

Thanx, Paul