Re: [PATCH 04/15] mm/hmm: unregister mmu_notifier when last HMM client quit v2

From: John Hubbard
Date: Thu Mar 22 2018 - 18:47:23 EST


On 03/21/2018 04:41 PM, Jerome Glisse wrote:
> On Wed, Mar 21, 2018 at 04:22:49PM -0700, John Hubbard wrote:
>> On 03/21/2018 11:16 AM, jglisse@xxxxxxxxxx wrote:
>>> From: JÃrÃme Glisse <jglisse@xxxxxxxxxx>
>>>
>>> This code was lost in translation at one point. This properly call
>>> mmu_notifier_unregister_no_release() once last user is gone. This
>>> fix the zombie mm_struct as without this patch we do not drop the
>>> refcount we have on it.
>>>
>>> Changed since v1:
>>> - close race window between a last mirror unregistering and a new
>>> mirror registering, which could have lead to use after free()
>>> kind of bug
>>>
>>> Signed-off-by: JÃrÃme Glisse <jglisse@xxxxxxxxxx>
>>> Cc: Evgeny Baskakov <ebaskakov@xxxxxxxxxx>
>>> Cc: Ralph Campbell <rcampbell@xxxxxxxxxx>
>>> Cc: Mark Hairgrove <mhairgrove@xxxxxxxxxx>
>>> Cc: John Hubbard <jhubbard@xxxxxxxxxx>
>>> ---
>>> mm/hmm.c | 35 +++++++++++++++++++++++++++++++++--
>>> 1 file changed, 33 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/hmm.c b/mm/hmm.c
>>> index 6088fa6ed137..f75aa8df6e97 100644
>>> --- a/mm/hmm.c
>>> +++ b/mm/hmm.c
>>> @@ -222,13 +222,24 @@ int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
>>> if (!mm || !mirror || !mirror->ops)
>>> return -EINVAL;
>>>
>>> +again:
>>> mirror->hmm = hmm_register(mm);
>>> if (!mirror->hmm)
>>> return -ENOMEM;
>>>
>>> down_write(&mirror->hmm->mirrors_sem);
>>> - list_add(&mirror->list, &mirror->hmm->mirrors);
>>> - up_write(&mirror->hmm->mirrors_sem);
>>> + if (mirror->hmm->mm == NULL) {
>>> + /*
>>> + * A racing hmm_mirror_unregister() is about to destroy the hmm
>>> + * struct. Try again to allocate a new one.
>>> + */
>>> + up_write(&mirror->hmm->mirrors_sem);
>>> + mirror->hmm = NULL;
>>
>> This is being set outside of locks, so now there is another race with
>> another hmm_mirror_register...
>>
>> I'll take a moment and draft up what I have in mind here, which is a more
>> symmetrical locking scheme for these routines.
>>
>
> No this code is correct. hmm->mm is set after hmm struct is allocated
> and before it is public so no one can race with that. It is clear in
> hmm_mirror_unregister() under the write lock hence checking it here
> under that same lock is correct.

Are you implying that code that calls hmm_mirror_register() should do
it's own locking, to prevent simultaneous calls to that function? Because
as things are right now, multiple threads can arrive at this point. The
fact that mirror->hmm is not "public" is irrelevant; what matters is that
>1 thread can change it simultaneously.

thanks,
--
John Hubbard
NVIDIA