Re: [v2 PATCH 2/2] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

From: Vlastimil Babka
Date: Wed Jul 17 2019 - 14:50:18 EST


On 7/17/19 8:23 PM, Yang Shi wrote:
>
>
> On 7/16/19 10:28 AM, Yang Shi wrote:
>>
>>
>> On 7/16/19 5:07 AM, Vlastimil Babka wrote:
>>> On 6/22/19 2:20 AM, Yang Shi wrote:
>>>> @@ -969,10 +975,21 @@ static long do_get_mempolicy(int *policy,
>>>> nodemask_t *nmask,
>>>> Â /*
>>>> ÂÂ * page migration, thp tail pages can be passed.
>>>> ÂÂ */
>>>> -static void migrate_page_add(struct page *page, struct list_head
>>>> *pagelist,
>>>> +static int migrate_page_add(struct page *page, struct list_head
>>>> *pagelist,
>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ unsigned long flags)
>>>> Â {
>>>> ÂÂÂÂÂ struct page *head = compound_head(page);
>>>> +
>>>> +ÂÂÂ /*
>>>> + * Non-movable page may reach here. And, there may be
>>>> +ÂÂÂÂ * temporaty off LRU pages or non-LRU movable pages.
>>>> +ÂÂÂÂ * Treat them as unmovable pages since they can't be
>>>> + * isolated, so they can't be moved at the moment. It
>>>> +ÂÂÂÂ * should return -EIO for this case too.
>>>> +ÂÂÂÂ */
>>>> +ÂÂÂ if (!PageLRU(head) && (flags & MPOL_MF_STRICT))
>>>> +ÂÂÂÂÂÂÂ return -EIO;
>>>> +
>>> Hm but !PageLRU() is not the only way why queueing for migration can
>>> fail, as can be seen from the rest of the function. Shouldn't all cases
>>> be reported?
>>
>> Do you mean the shared pages and isolation failed pages? I'm not sure
>> whether we should consider these cases break the semantics or not, so
>> I leave them as they are. But, strictly speaking they should be
>> reported too, at least for the isolation failed page.

CC'd linux-api, should be done on v3 posting also.

> By reading mbind man page, it says:
>
> If MPOL_MF_MOVE is specified in flags, then the kernel will attempt to
> move all the existing pages in the memory range so that they follow the
> policy. Pages that are shared with other processes will not be moved.Â
> If MPOL_MF_STRICT is also specified, then the call fails with the error
> EIO if some pages could not be moved.

I don't think this means that for shared pages, -EIO should not be
reported. I can imagine both interpretations of the paragraph. I guess
we can be conservative and keep not reporting them, if that was always
the case - but then perhaps clarify the man page?

> It looks the code already handles shared page correctly, we just need
> return -EIO for isolation failed page if MPOL_MF_STRICT is specified.
>
>>
>> Thanks,
>> Yang
>>
>>>
>>>> ÂÂÂÂÂ /*
>>>> ÂÂÂÂÂÂ * Avoid migrating a page that is shared with others.
>>>> ÂÂÂÂÂÂ */
>>>> @@ -984,6 +1001,8 @@ static void migrate_page_add(struct page *page,
>>>> struct list_head *pagelist,
>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ hpage_nr_pages(head));
>>>> ÂÂÂÂÂÂÂÂÂ }
>>>> ÂÂÂÂÂ }
>>>> +
>>>> +ÂÂÂ return 0;
>>>> Â }
>>>> Â Â /* page allocation callback for NUMA node migration */
>>>> @@ -1186,9 +1205,10 @@ static struct page *new_page(struct page
>>>> *page, unsigned long start)
>>>> Â }
>>>> Â #else
>>>> Â -static void migrate_page_add(struct page *page, struct list_head
>>>> *pagelist,
>>>> +static int migrate_page_add(struct page *page, struct list_head
>>>> *pagelist,
>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ unsigned long flags)
>>>> Â {
>>>> +ÂÂÂ return -EIO;
>>>> Â }
>>>> Â Â int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
>>>>
>>
>