Re: [PATCH 02/13] mm: Revalidate anon_vma in page_lock_anon_vma()

From: Minchan Kim
Date: Fri Apr 09 2010 - 04:01:26 EST


Hi, Kosaki.

On Fri, Apr 9, 2010 at 4:29 PM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> Hmm, I think following.
>>
>> Assume a page is ANON and SwapCache, and it has only one reference.
>> Consider it's read-only mapped and cause do_wp_page().
>> page_mapcount(page) == 1 here.
>>
>>     CPU0                          CPU1
>>
>> 1. do_wp_page()
>> 2. .....
>> 3. replace anon_vma.              anon_vma = page_lock_anon_vma()
>>
>> So page_lock_anon_vma() may have taken the lock on the wrong anon_vma here (mapcount=1).
>>
>> 4. modify pte to writable.        do something...
>>
>> After taking the lock, on CPU1 the pte at the address estimated by
>> vma_address(vma, page) contains the pfn of the page, so
>> page_check_address() will succeed.
>>
>> I'm not sure how dangerous this is.
>> But it's possible that CPU1 cannot notice that the anon_vma was replaced,
>> and modifies the pte without holding the anon_vma's lock which the code
>> believes it holds.
>
>
> Hehe, page_referenced() can already see an unstable VM_LOCKED value. So
> in the worst case we make a false positive pageout, but it's not a disaster.
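
To make the window above concrete, the revalidation could look roughly
like the following. This is only my rough sketch against rmap.c, not the
actual patch:

struct anon_vma *page_lock_anon_vma(struct page *page)
{
	struct anon_vma *anon_vma;
	unsigned long anon_mapping;

again:
	rcu_read_lock();
	anon_mapping = (unsigned long)ACCESS_ONCE(page->mapping);
	if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
		goto out;
	if (!page_mapped(page))
		goto out;

	anon_vma = (struct anon_vma *)(anon_mapping - PAGE_MAPPING_ANON);
	spin_lock(&anon_vma->lock);

	/*
	 * Revalidate under the lock: if do_wp_page() replaced the page's
	 * anon_vma in the meantime, we locked the wrong one, so retry.
	 */
	if ((unsigned long)ACCESS_ONCE(page->mapping) != anon_mapping) {
		spin_unlock(&anon_vma->lock);
		rcu_read_unlock();
		goto again;
	}
	/* rcu_read_lock() stays held; page_unlock_anon_vma() drops it. */
	return anon_vma;
out:
	rcu_read_unlock();
	return NULL;
}

With a check like that, at least the case in the diagram above (the
mapping already replaced before the lock is taken) is caught and retried.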

OFF-TOPIC:

I think you pointed out a good thing, too. :)

You mean that even though an application calls mlock() on a vma, a few pages
in that vma can still be swapped out due to a race between mlock and reclaim?

Although it's not a disaster, it apparently breaks the API.
The man page says:
" mlock() and munlock()
mlock() locks pages in the address range starting at addr and
continuing for len bytes. All pages that contain a part of the
specified address range are guaranteed to be resident in RAM when the
call returns successfully; the pages are guaranteed to stay in RAM
until later unlocked."
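
For reference, the window I mean is around this check in
page_referenced_one(). I'm quoting it roughly from memory, so take it
only as a sketch of the relevant fragment:

	pte = page_check_address(page, mm, address, &ptl, 0);
	if (!pte)
		goto out;

	/* mlocked page: don't count a reference, let it be culled later */
	if (vma->vm_flags & VM_LOCKED) {
		*mapcount = 1;	/* break early from the rmap loop */
		*vm_flags |= VM_LOCKED;
		goto out_unmap;
	}

Reclaim doesn't hold mmap_sem here, so this test can race with mlock()
and a page of an mlocked vma may still be paged out once. That's the
API break I mean.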

Do you have a plan to solve this problem?

And how about adding a simple comment about that race in page_referenced_one()?
Could you send a patch?
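
For example, something along these lines next to the VM_LOCKED test
(the wording is only a suggestion):

	/*
	 * vm_flags is not read under mmap_sem, so this test can race
	 * with mlock(): a page in a range that is being mlocked may
	 * still be paged out once before VM_LOCKED becomes visible
	 * here. In the worst case that's a spurious pageout, not a
	 * disaster, but it does weaken the mlock() guarantee.
	 */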




--
Kind regards,
Minchan Kim