Re: [PATCH v4 11/23] device-dax: Kill dax_kmem_res

From: David Hildenbrand
Date: Fri Sep 25 2020 - 04:55:01 EST


On 24.09.20 23:50, Dan Williams wrote:
> On Thu, Sep 24, 2020 at 2:42 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>>
>>
>>
>>> Am 24.09.2020 um 23:26 schrieb Dan Williams <dan.j.williams@xxxxxxxxx>:
>>>
>>> [..]
>>>>> I'm not suggesting to busy the whole "virtio" range, just the portion
>>>>> that's about to be passed to add_memory_driver_managed().
>>>>
>>>> I'm afraid I don't get your point. For virtio-mem:
>>>>
>>>> Before:
>>>>
>>>> 1. Create virtio0 container resource
>>>>
>>>> 2. (somewhen in the future) add_memory_driver_managed()
>>>> - Create resource (System RAM (virtio_mem)), marking it busy/driver
>>>> managed
>>>>
>>>> After:
>>>>
>>>> 1. Create virtio0 container resource
>>>>
>>>> 2. (somewhen in the future) Create resource (System RAM (virtio_mem)),
>>>> marking it busy/driver managed
>>>> 3. add_memory_driver_managed()
>>>>
>>>> Not helpful or simpler IMHO.
>>>
>>> The concern I'm trying to address is the theoretical race window and
>>> layering violation in this sequence in the kmem driver:
>>>
>>> 1/ res = request_mem_region(...);
>>> 2/ res->flags = IORESOURCE_MEM;
>>> 3/ add_memory_driver_managed();
>>>
>>> Between 2/ and 3/ something can race and think that it owns the
>>> region. Do I think it will happen in practice, no, but it's still a
>>> pattern that deserves come cleanup.
>>
>> I think in that unlikely event (rather impossible), add_memory_driver_managed() should fail, detecting a conflicting (busy) resource. Not sure what will happen next ( and did not double-check).
>
> add_memory_driver_managed() will fail, but the release_mem_region() in
> kmem to unwind on the error path will do the wrong thing because that
> other driver thinks it got ownership of the region.
>

I think if somebody would race and claim the region for itself (after we
unchecked the BUSY flag), there would be another memory resource below
our resource container (e.g., via __request_region()).

So, interestingly, the current code will do a

release_resource->__release_resource(old, true);

which will remove whatever somebody added below the resource.

If we were to do a

remove_resource->__release_resource(old, false);

we would only remove what we temporarily added, relocating anychildren
(someone nasty added).

But yeah, I don't think we have to worry about this case.

>> But yeah, the way the BUSY bit is cleared here is wrong - simply overwriting other bits. And it would be even better if we could avoid manually messing with flags here.
>
> I'm ok to leave it alone for now (hasn't been and likely never will be
> a problem in practice), but I think it was still worth grumbling

Definitely, it gives us a better understanding.

> about. I'll leave that part of kmem alone in the upcoming split of
> dax_kmem_res removal.

Yeah, stuff is more complicated than I would wished, so I guess it's
better to leave it alone for now until we actually see issues with
somebody else regarding *our* device-owned region (or we're able to come
up with a cleanup that keeps all corner cases working for kmem and
virtio-mem).

--
Thanks,

David / dhildenb