Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem

From: Davidlohr Bueso
Date: Tue Jul 04 2017 - 11:02:03 EST


On Tue, 04 Jul 2017, Thomas Gleixner wrote:

Andrey reported a potential deadlock with the memory hotplug lock and the
cpu hotplug lock.

The reason is that memory hotplug takes the memory hotplug lock and then
calls stop_machine() which calls get_online_cpus(). That's the reverse lock
order to get_online_cpus(); get_online_mems(); in mm/slab_common.c.
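
The inversion described above can be replayed with a small userspace
sketch. This is not kernel code and not how lockdep is implemented; it only
records which lock was taken while another was held and flags the reverse
order. The identifiers CPU_HOTPLUG, MEM_HOTPLUG, track_acquire(),
track_release() and replay_hotplug_paths() are hypothetical and exist only
for this illustration.

```c
#include <stdbool.h>

/* Hypothetical lock ids standing in for the two hotplug locks. */
enum lock_id { CPU_HOTPLUG, MEM_HOTPLUG, NR_LOCKS };

static bool held[NR_LOCKS];                 /* locks currently held */
static bool seen_order[NR_LOCKS][NR_LOCKS]; /* [a][b]: b was taken while a was held */

/* Record an acquisition; return true if it inverts an order seen earlier. */
static bool track_acquire(enum lock_id l)
{
	bool inversion = false;

	for (int h = 0; h < NR_LOCKS; h++) {
		if (!held[h] || h == (int)l)
			continue;
		if (seen_order[l][h])	/* earlier: l then h; now: h then l */
			inversion = true;
		seen_order[h][l] = true;
	}
	held[l] = true;
	return inversion;
}

static void track_release(enum lock_id l)
{
	held[l] = false;
}

/* Replay both paths; returns true if either acquisition inverts the order. */
static bool replay_hotplug_paths(void)
{
	bool inverted = false;

	/* slab path: get_online_cpus(); get_online_mems(); */
	inverted |= track_acquire(CPU_HOTPLUG);
	inverted |= track_acquire(MEM_HOTPLUG);
	track_release(MEM_HOTPLUG);
	track_release(CPU_HOTPLUG);

	/* memory hotplug path: hotplug lock, then stop_machine() -> get_online_cpus() */
	inverted |= track_acquire(MEM_HOTPLUG);
	inverted |= track_acquire(CPU_HOTPLUG);
	track_release(CPU_HOTPLUG);
	track_release(MEM_HOTPLUG);

	return inverted;
}
```

The second path takes the same two locks in the opposite order, which is
the pattern a lock-order checker flags even if the two paths never actually
race.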

The problem has been there forever. The reason why this was never reported
is that the cpu hotplug locking had this homebrewed recursive reader-writer
semaphore construct which, due to the recursion, evaded the full lockdep
coverage. The memory hotplug code copied that construct verbatim and
therefore has similar issues.

Three steps to fix this:

1) Convert the memory hotplug locking to a percpu rwsem so the potential
issues get reported properly by lockdep.

I particularly like how well suited mem hotplug is for pcpu-rwsem.
As a side effect you end up optimizing get/put_online_mems() at the cost
of more overhead for the actual hotplug operation, which is rare and less
performance critical.
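
To make that trade-off concrete, here is a rough single-threaded userspace
model. It is not the kernel's percpu rwsem: there are no atomics, no RCU
and no sleeping, and every name in it (pcpu_rwsem_model, MODEL_NR_CPUS,
model_*()) is hypothetical. It only shows that a reader touches a single
per-CPU counter while a writer has to look at all of them.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* arbitrary, for illustration only */

struct pcpu_rwsem_model {
	int  read_count[MODEL_NR_CPUS];	/* one reader counter per CPU */
	bool writer_active;
};

/* Reader fast path: touches only the local CPU's counter. */
static bool model_down_read(struct pcpu_rwsem_model *s, int cpu)
{
	if (s->writer_active)
		return false;	/* a real reader would take a slow path and wait */
	s->read_count[cpu]++;
	return true;
}

static void model_up_read(struct pcpu_rwsem_model *s, int cpu)
{
	s->read_count[cpu]--;
}

/* Writer: has to sum every CPU's counter before it may proceed. */
static bool model_try_down_write(struct pcpu_rwsem_model *s)
{
	int sum = 0;

	if (s->writer_active)
		return false;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += s->read_count[cpu];
	if (sum != 0)
		return false;	/* a real writer would block new readers and wait */
	s->writer_active = true;
	return true;
}

static void model_up_write(struct pcpu_rwsem_model *s)
{
	s->writer_active = false;
}
```

In this model the read side is O(1) and the write side is O(number of
CPUs), which matches the intuition above: cheap get/put_online_mems(),
more expensive hotplug.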

Thanks,
Davidlohr