Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

From: Dan Williams
Date: Wed Mar 01 2017 - 17:56:28 EST


On Wed, Mar 1, 2017 at 9:04 AM, Heiko Carstens
<heiko.carstens@xxxxxxxxxx> wrote:
> On Wed, Mar 01, 2017 at 07:52:18AM -0800, Dan Williams wrote:
>> On Wed, Mar 1, 2017 at 4:51 AM, Heiko Carstens
>> <heiko.carstens@xxxxxxxxxx> wrote:
>> > Since it is anything but obvious why Dan wrote in the changelog of
>> > b5d24fda9c3d ("mm, devm_memremap_pages: hold device_hotplug lock over
>> > mem_hotplug_{begin, done}") that write accesses to
>> > mem_hotplug.active_writer are coordinated via lock_device_hotplug(), I'd
>> > rather propose a new private memory_add_remove_lock which has semantics
>> > similar to the cpu_add_remove_lock for cpu hotplug (see patch below).
>> >
>> > However, instead of sprinkling locking/unlocking of that new lock around all
>> > calls of mem_hotplug_begin() and mem_hotplug_done(), simply include locking
>> > and unlocking in these two functions.
>> >
>> > This still allows get_online_mems() and put_online_mems() to work, while at
>> > the same time preventing mem_hotplug.active_writer corruption.
>> >
>> > Any opinions?
>>
>> Sorry, yes, I didn't make it clear that I derived that locking
>> requirement from store_mem_state() and its usage of
>> lock_device_hotplug_sysfs().
>>
>> That routine is trying very hard not to trip the soft-lockup detector. It
>> seems like that wants to be an interruptible wait.
>
> If you look at commit 5e33bc4165f3 ("driver core / ACPI: Avoid device hot
> remove locking issues") then lock_device_hotplug_sysfs() was introduced to
> avoid a different subtle deadlock, but it also sleeps uninterruptibly, though
> not for more than 5ms ;)
>
> However I'm not sure if the device hotplug lock should also be used to fix
> an unrelated bug that was introduced with the get_online_mems() /
> put_online_mems() interface. Should it?

No, I don't think it should.

I like your proposed direction of creating a new lock internal to
mem_hotplug_begin() and mem_hotplug_done() to protect active_writer, so
that we stop relying on lock_device_hotplug() to serve this purpose.
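
To make sure we mean the same thing, here is a rough userspace sketch of
that direction. It is not the kernel code and not your patch: pthreads
stand in for kernel mutexes, the struct layout and the has_writer flag
are made up for illustration, and hotplug_worker() and sections_run
exist only to exercise the two entry points. The reader side
(get_online_mems() / put_online_mems()) is left out entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's mem_hotplug state. */
static struct {
	/* Plays the role of the proposed memory_add_remove_lock. */
	pthread_mutex_t add_remove_lock;
	bool has_writer;
	pthread_t active_writer;
} mem_hotplug = { .add_remove_lock = PTHREAD_MUTEX_INITIALIZER };

/* Counts completed writer sections; only touched with the lock held. */
static long sections_run;

void mem_hotplug_begin(void)
{
	/* Writers serialize here, so callers need no outer lock. */
	pthread_mutex_lock(&mem_hotplug.add_remove_lock);
	mem_hotplug.active_writer = pthread_self();
	mem_hotplug.has_writer = true;
}

void mem_hotplug_done(void)
{
	/* Only the thread that called mem_hotplug_begin() may finish. */
	assert(mem_hotplug.has_writer &&
	       pthread_equal(mem_hotplug.active_writer, pthread_self()));
	mem_hotplug.has_writer = false;
	pthread_mutex_unlock(&mem_hotplug.add_remove_lock);
}

/* Illustrative writer: repeatedly enters and leaves the writer section. */
void *hotplug_worker(void *unused)
{
	(void)unused;
	for (int i = 0; i < 1000; i++) {
		mem_hotplug_begin();
		sections_run++;
		mem_hotplug_done();
	}
	return NULL;
}
```

The point is only that the lock is taken and dropped inside the
begin/done pair, so active_writer is never written by two threads at
once regardless of what the callers hold.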

> If so, we need to sprinkle around a couple of lock_device_hotplug() calls
> near mem_hotplug_begin() calls, like Sebastian already started, and give it
> additional semantics (protecting mem_hotplug.active_writer), and hope it
> doesn't lead to deadlocks anywhere.

I'll put your proposed patch through some testing.