Re: [PATCH v2 0/2] A couple hugetlbfs fixes

From: Mike Kravetz
Date: Mon Apr 08 2019 - 23:30:32 EST


On 4/8/19 12:48 PM, Davidlohr Bueso wrote:
> On Thu, 28 Mar 2019, Mike Kravetz wrote:
>
>> - A BUG can be triggered (not easily) due to temporarily mapping a
>> page before doing a COW.
>
> But you actually _have_ seen it? Do you have the traces? I ask
> not because of the patches per se, but because it would be nice
> to have a real snippet in the Changelog for patch 2.

Yes, I actually saw this problem. It happened while I was debugging and
testing some patches for hugetlb migration. The BUG I hit was in
unaccount_page_cache_page(): VM_BUG_ON_PAGE(page_mapped(page), page).

Stack trace was something like:
	unaccount_page_cache_page
	__delete_from_page_cache
	delete_from_page_cache
	remove_huge_page
	remove_inode_hugepages
	hugetlbfs_punch_hole
	hugetlbfs_fallocate

When I hit that, it took me a while to figure out how it could happen,
i.e. how could a page be mapped at that point in remove_inode_hugepages()?
That code checks page_mapped() and we are holding the fault mutex. With
some additional debug code (strategic udelays) I could hit the issue on
a somewhat regular basis, and verified that another thread was in the
hugetlb_no_page/hugetlb_cow path for the same page at the same time.
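
For illustration, a rough user-space sketch of the two operations that
would have to run concurrently on the same page. This is not the test
that was actually run. The helper names are hypothetical, and it is an
assumption here that fd refers to a file on a hugetlbfs mount and that
offset/len are huge page aligned.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>          /* fallocate() */
#include <linux/falloc.h>   /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Side A (hypothetical helper): punch a hole in the file. On hugetlbfs
 * this enters hugetlbfs_fallocate -> hugetlbfs_punch_hole, the path in
 * the stack trace above. Returns 0 on success, -1 with errno set.
 */
static int punch_hole(int fd, off_t offset, off_t len)
{
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len);
}

/*
 * Side B (hypothetical helper): map the same range privately and write
 * to it, so the first store takes a write fault that needs a COW.
 * Returns 0 on success, -1 on failure.
 */
static int private_write_fault(int fd, off_t offset, size_t len)
{
	char *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, offset);
	if (p == MAP_FAILED)
		return -1;

	p[0] = 1;	/* first write to the page triggers the fault */

	return munmap(p, len);
}
```

Running the two helpers from different threads against the same huge
page only exercises the two paths; hitting the window still depends on
timing, which is why the udelays were needed.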

Unfortunately, I did not save the traces. I am trying to recreate them
now. However, my test system was recently updated, so it might take a
little time.
--
Mike Kravetz