Re: [syzbot] WARNING in follow_hugetlb_page

From: Minchan Kim
Date: Fri May 20 2022 - 18:19:58 EST


On Mon, May 16, 2022 at 08:37:01PM -0700, Mike Kravetz wrote:

< snip >

> > I need to look at this a little more closely, it is making me wonder
> > whether the is_pinnable_page() check is a problem in this path. The
> > comment in try_grab_folio() indicates that the early return is a hack
> > (it assumes that the caller is in the gup fast path), and maybe the hack
> > is just wrong here--I think we're actually on the slow gup path. Not
> > good.
> >
> > Mike, any thoughts here?
> >
>
> Do you know why try_grab_compound_page(now try_grab_folio) checks for
> pinnable when try_grab_page does not?
>
> Then I guess the next question is 'Should we allow pinning of hugetlb pages
> in these areas?'. My first thought would be no. But, recall it was 'allowed'
> until that commit which changed try_grab_page to try_grab_compound_page.

The reason we don't allow longterm pinning in a CMA area is to improve
the success ratio of big contiguous memory allocations when someone
claims the memory space. Thus, any pages from the CMA area that are
mapped to userspace shouldn't be longterm pinned. Otherwise, cma_alloc
will fail due to migration failure.
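
To illustrate the failure mode, here is a minimal user-space sketch (not
kernel code). struct toy_page and toy_cma_alloc() are hypothetical
stand-ins invented for this example; the real cma_alloc() has a
different signature and actually migrates pages. The sketch only models
the point above: one longterm-pinned page in the requested range is
enough to make the whole contiguous allocation fail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page frame; NOT the kernel's struct page. */
struct toy_page {
	bool in_use;          /* currently holds someone's data */
	bool longterm_pinned; /* holds a longterm pin, so it cannot migrate */
};

/*
 * Toy stand-in for cma_alloc() (hypothetical, simplified): the request
 * for [start, start + count) succeeds only if every in-use page in the
 * range could be migrated away, i.e. none of them is longterm pinned.
 */
static bool toy_cma_alloc(struct toy_page *area, size_t start, size_t count)
{
	for (size_t i = start; i < start + count; i++) {
		if (area[i].in_use && area[i].longterm_pinned)
			return false; /* migration fails, so allocation fails */
	}

	/* All occupants were movable: the range now belongs to the claimer. */
	for (size_t i = start; i < start + count; i++) {
		area[i].in_use = true;
		area[i].longterm_pinned = false;
	}
	return true;
}
```

In this model, a range containing only unpinned pages can always be
claimed, while a single pinned page blocks every request that overlaps
it.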

In the hugetlb case (I might be missing something), the CMA memory was
already claimed by hugeTLB and the big contiguous memory was mapped to
userspace, so there is no reason to prevent longterm pinning: hugeTLB
will never claim that CMA memory again until the user releases the
memory and hugeTLB frees the range using cma_release.

> In the 'common' case of compaction, we do not attempt to migrate/move hugetlb
> pages (last time I looked), so long term pinning should not be an issue.
> However, for things like memory offline or alloc_contig_range() we want to

Memory offline would be an issue, so we shouldn't allow pinning of any
pages in the *movable zone*.

Isn't alloc_contig_range just best effort? Then it wouldn't be a big
problem to allow pinning in those areas. What matters is whether the
target range of alloc_contig_range is backed by CMA or the movable zone,
and what the use cases are.

IOW, pinning in the movable zone should never be allowed. In the CMA
case, if the pages are used as normal process memory instead of hugeTLB,
we shouldn't allow longterm pinning, since someone can claim that memory
suddenly. However, we are fine to allow longterm pinning if the CMA
memory is already claimed and mapped to userspace (the hugeTLB case,
IIUC).
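
The rule proposed above can be written down as a small decision
function. This is only a user-space sketch of the proposal in this mail,
not the kernel's is_pinnable_page(); enum toy_region, struct
toy_page_info and toy_longterm_pin_allowed() are hypothetical names made
up for illustration.

```c
#include <stdbool.h>

/* Where the page lives (hypothetical classification for this sketch). */
enum toy_region {
	TOY_NORMAL,       /* neither movable zone nor CMA */
	TOY_MOVABLE_ZONE, /* must stay migratable for memory offline */
	TOY_CMA,          /* contiguous area that may be claimed later */
};

struct toy_page_info {
	enum toy_region region;
	bool is_hugetlb; /* page belongs to a hugeTLB allocation */
};

/* Proposed policy: may this page be longterm pinned? */
static bool toy_longterm_pin_allowed(const struct toy_page_info *p)
{
	switch (p->region) {
	case TOY_MOVABLE_ZONE:
		/* Never: a pin would break memory offline. */
		return false;
	case TOY_CMA:
		/*
		 * Only if hugeTLB already claimed the range; ordinary
		 * process memory in CMA can be reclaimed at any time.
		 */
		return p->is_hugetlb;
	default:
		return true;
	}
}
```

Note that under this sketch a hugeTLB page in the movable zone is still
rejected; only the CMA case depends on whether hugeTLB owns the memory.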

Please correct me if I'm missing something.

Thanks.