Re: [syzbot] WARNING in follow_hugetlb_page

From: Minchan Kim
Date: Sat May 21 2022 - 12:36:43 EST


On Sat, May 21, 2022 at 05:51:58PM +0200, David Hildenbrand wrote:
> On 21.05.22 17:24, Minchan Kim wrote:
> > On Fri, May 20, 2022 at 05:04:22PM -0700, Mike Kravetz wrote:
> >> On 5/20/22 16:43, Minchan Kim wrote:
> >>> On Fri, May 20, 2022 at 04:31:31PM -0700, Mike Kravetz wrote:
> >>>> On 5/20/22 15:56, John Hubbard wrote:
> >>>>> On 5/20/22 15:19, Minchan Kim wrote:
> >>>>>> The memory offline would be an issue so we shouldn't allow pinning of any
> >>>>>> pages in *movable zone*.
> >>>>>>
> >>>>>> Isn't alloc_contig_range just best effort? Then, it wouldn't be a big
> >>>>>> problem to allow pinning on those area. The matter is what target range
> >>>>>> on alloc_contig_range is backed by CMA or movable zone and usecases.
> >>>>>>
> >>>>>> IOW, movable zone should be never allowed. But CMA case, if pages
> >>>>>> are used by normal process memory instead of hugeTLB, we shouldn't
> >>>>>> allow longterm pinning since someone can claim those memory suddenly.
> >>>>>> However, we are fine to allow longterm pinning if the CMA memory
> >>>>>> already claimed and mapped at userspace(hugeTLB case IIUC).
> >>>>>>
> >>>>>
> >>>>> From Mike's comments and yours, plus a rather quick reading of some
> >>>>> CMA-related code in mm/hugetlb.c (free_gigantic_page(), alloc_gigantic_pages()), the following seems true:
> >>>>>
> >>>>> a) hugetlbfs can allocate pages *from* CMA, via cma_alloc()
> >>>>>
> >>>>> b) while hugetlbfs is using those CMA-allocated pages, it is debatable
> >>>>> whether those pages should be allowed to be long term pinned. That's
> >>>>> because there are two cases:
> >>>>>
> >>>>>     Case 1: pages are longterm pinned, then released, all while
> >>>>>             owned by hugetlbfs. No problem.
> >>>>>
> >>>>>     Case 2: pages are longterm pinned, but then hugetlbfs releases the
> >>>>>             pages entirely (via unmounting hugetlbfs, I presume). In
> >>>>>             this case, we now have CMA page that are long-term pinned,
> >>>>>             and that's the state we want to avoid.
> >>>>
> >>>> I do not think case 2 can happen. A hugetlb page can only be changed back
> >>>> to 'normal' (buddy) pages when ref count goes to zero.
> >>>>
> >>>> It should also be noted that hugetlb code sets up the CMA area from which
> >>>> hugetlb pages can be allocated. This area is never unreserved/freed.
> >>>>
> >>>> I do not think there is a reason to disallow long term pinning of hugetlb
> >>>> pages allocated from THE hugetlb CMA area.
>
> Hm. We primarily use CMA for gigantic pages only IIRC. Ordinary huge
> pages come via the buddy.
>
> Assume we allocated a (movable) 2MiB huge page ordinarily via the buddy
> and it ended up on that CMA area by pure luck (as it's movable). If we'd
> allow to pin it long-term, allocating a gigantic page from the
> designated CMA area would fail.

If we allow the longterm pin against the hugetlb page come via buddy,
it should be migrated out of CMA before the longterm pinning by
check_and_migrate_movable_pages, IIUC.
If so, what the allocating a giganitc page from the designated CMA area
would fail?

>
> So we'd want to allow long-term pinning a gigantic page but we'd not
> want to allow long-term pinning an ordinary huge page. We'd want to
> migrate the latter away.

Sure. Gigantic page was already CMA claimed page so there is no user
in the future to claim the memory again so fine to allow longterm pin
but ordinary huge page shouldn't be allowed since CMA owner could
claim the memory some day.

>
>
> The general rules are:
>
> ZONE_MOVABLE: nobody is allowed to place unmovable allocations there; it
> could prevent memory offlining/unplug.
>
> CMA: nobody *but the designated owner* is allowed to place unmovable
> memory there; it could prevent the actual owner to allocate contiguous
> memory.

I am confused what's the meaning of designated owner and actuall owner
in your context.

What I thought about the issue based on you explanation:

HugeTLB allocates its page by two types of allocation

1. alloc_pages(GFP_MOVABLE)

It could allocate the hugetlb page from CMA area but longterm pin
should migrate them out of cma before the pinning so allowing
the pinning on the page is no problem and current code works like
that.

check_and_migrate_movable_pages

2. cma_alloc

The cma_alloc is used only for *gigantic page* and the hugetlbfs
is the very owner of the page. IOW, if the hugetlbfs was succeeded
to allocate the gigantic page by cma_alloc, there is no other
owner to be able to claim the page any longer so it's fine to
allow longterm pinning againt the gingantic page but current.
However, current code doesn't work like that due to
is_pinnable_page. IOW, hugetlbfs need a way to distinguish
whether the page owner is hugetlbfs or not.

Are we on same page?