Re: Deadlock between cpu_hotplug_begin and cpu_add_remove_lock

From: Rusty Russell
Date: Thu Jan 23 2014 - 18:39:30 EST


"Srivatsa S. Bhat" <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> writes:
> On 01/23/2014 07:59 AM, Rusty Russell wrote:
>> "Srivatsa S. Bhat" <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> writes:
>>> On 01/22/2014 02:00 PM, Srivatsa S. Bhat wrote:
>>>> Hi Paul,
>>
>> I find an old patch for register_allcpu_notifier(), but the "bool
>> replay_history" should be eliminated (always true): it's too weird.
>>
>
> Sorry, I didn't get this part. Why do you say that replay_history
> will always be true?

OK, let me start again and try to explain myself properly:

register_cpu_notifier is a bad API. It's hard to get right because:
1) You need to loop over online (or present) cpus once before you call
it.
2) You have to beware the race between the loop and registration, but
much example code happens at boot time where it doesn't matter,
so random author is likely to copy that and have a race.
3) You have two paths doing the same thing: the loop which is run on
every machine (cpu hotplug or not), and the notifier callback which
is run far less rarely.

What we actually *want* is a routine which will reliably call for every
current and future CPU, and then there are very few places which should
use the current register_cpu_notifier().

ie. halfway between register_cpu_notifier() (too racy) and
register_allcpu_notifier() (too simplified).

Let's call it register_cpu_callback / unregister_cpu_callback?

> By the way, I'm still tempted to try out the simpler-looking alternative
> idea of exporting cpu_maps_update_begin() and cpu_maps_update_done()
> and then mandating that the callers do:
>
> cpu_maps_update_begin();
> for_each_online_cpu(cpu) {
> ...
> }
>
> __register_cpu_notifier(); // this doesn't take the add_remove_lock
> cpu_maps_update_done();

Sure, fix this one for -stable. But let's create an idiom we can be
proud of for the longer term.

Thanks,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/