Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem

From: Vlastimil Babka
Date: Tue Jul 04 2017 - 08:11:08 EST


On 07/04/2017 11:32 AM, Thomas Gleixner wrote:
> Andrey reported a potential deadlock with the memory hotplug lock and the
> cpu hotplug lock.
>
> The reason is that memory hotplug takes the memory hotplug lock and then
> calls stop_machine() which calls get_online_cpus(). That's the reverse lock
> order to get_online_cpus(); get_online_mems(); in mm/slab_common.c
>
> The problem has been there forever. The reason why this was never reported
> is that the cpu hotplug locking had this homebrewed recursive reader-writer
> semaphore construct which, due to the recursion, evaded the full lockdep
> coverage. The memory hotplug code copied that construct verbatim and
> therefore has similar issues.
>
> Three steps to fix this:
>
> 1) Convert the memory hotplug locking to a per cpu rwsem so the potential
> issues get reported properly by lockdep.
>
> 2) Lock the online cpus in mem_hotplug_begin() before taking the memory
> hotplug rwsem and use stop_machine_cpuslocked() in the page_alloc code
> and use to avoid recursive locking.

^ s/and use // ?

>
> 3) The cpu hotplug locking in #2 causes a recursive locking of the cpu
> hotplug lock via __offline_pages() -> lru_add_drain_all(). Solve this by
> invoking lru_add_drain_all_cpuslocked() instead.
>
> Reported-by: Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

> ---
> mm/memory_hotplug.c | 89 ++++++++--------------------------------------------
> mm/page_alloc.c | 2 -
> 2 files changed, 16 insertions(+), 75 deletions(-)

Nice! Glad to see the crazy code go.