Re: [PATCH 01/51] CPU hotplug: Provide lockless versions ofcallback registration functions

From: Toshi Kani
Date: Mon Feb 10 2014 - 20:33:00 EST


On Thu, 2014-02-06 at 03:34 +0530, Srivatsa S. Bhat wrote:
:
> The problem here is that callback registration takes the locks in one order
> whereas the CPU hotplug operations take the same locks in the opposite order.
> To avoid this issue and to provide a race-free method to register CPU hotplug
> callbacks (along with initialization of already online CPUs), introduce new
> variants of the callback registration APIs that simply register the callbacks
> without holding the cpu_add_remove_lock during the registration. That way,
> we can avoid the ABBA scenario. However, we will need to hold the
> cpu_add_remove_lock throughout the entire critical section, to protect updates
> to the callback/notifier chain.
>
> This can be achieved by writing the callback registration code as follows:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* This doesn't take the cpu_add_remove_lock */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_maps_update_done();
>
> Note that we can't use get_online_cpus() here instead of cpu_maps_update_begin()
> because the cpu_hotplug.lock is dropped during the invocation of CPU_POST_DEAD
> notifiers, and hence get_online_cpus() cannot provide the necessary
> synchronization to protect the callback/notifier chains against concurrent
> reads and writes. On the other hand, since the cpu_add_remove_lock protects
> the entire hotplug operation (including CPU_POST_DEAD), we can use
> cpu_maps_update_begin/done() to guarantee proper synchronization.
>
> Also, since cpu_maps_update_begin/done() is like a super-set of
> get/put_online_cpus(), the former naturally protects the critical sections
> from concurrent hotplug operations.

get/put_online_cpus() is a reader-lock and concurrent executions are
allowed among the readers. They won't be serialized until a cpu
online/offline operation begins. By replacing this lock with
cpu_maps_update_begin/done(), we now serialize all readers. Isn't that
too restrictive? Can we fix the issue with CPU_POST_DEAD and continue
to use get_online_cpus()?

Thanks,
-Toshi



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/