Re: [PATCH] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY

From: Peter Xu
Date: Wed May 12 2021 - 16:37:13 EST


Mina,

On Wed, May 12, 2021 at 12:42:32PM -0700, Mina Almasry wrote:
> > >> @@ -4868,30 +4869,39 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> > >> struct page **pagep)
> > >> {
> > >> bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE);
> > >> - struct address_space *mapping;
> > >> - pgoff_t idx;
> > >> + struct hstate *h = hstate_vma(dst_vma);
> > >> + struct address_space *mapping = dst_vma->vm_file->f_mapping;
> > >> + pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr);
> > >> unsigned long size;
> > >> int vm_shared = dst_vma->vm_flags & VM_SHARED;
> > >> - struct hstate *h = hstate_vma(dst_vma);
> > >> pte_t _dst_pte;
> > >> spinlock_t *ptl;
> > >> - int ret;
> > >> + int ret = -ENOMEM;
> > >> struct page *page;
> > >> int writable;
> > >>
> > >> - mapping = dst_vma->vm_file->f_mapping;
> > >> - idx = vma_hugecache_offset(h, dst_vma, dst_addr);
> > >> + /* Out parameter. */
> > >> + WARN_ON(*pagep);
> > >
> > > I don't think this warning works, because we do set *pagep, in the
> > > copy_huge_page_from_user failure case. In that case, the following
> > > happens:
> > >
> > > 1. We set *pagep, and return immediately.
> > > 2. Our caller notices this particular error, drops mmap_lock, and then
> > > calls us again with *pagep set.
> > >
> > > In this path, we're supposed to just re-use this existing *pagep
> > > instead of allocating a second new page.
> > >
> > > I think this also means we need to keep the "else" case where *pagep
> > > is set below.
> > >
> >
> > +1 to Peter's comment.
> >
>
> Gah, sorry about that. I'll fix in v2.

I have a question regarding v1: how do you guarantee huge_add_to_page_cache()
won't fail again even if checked before page alloc? Say, what if the page
cache got inserted after hugetlbfs_pagecache_present() (which is newly added in
your v1) but before huge_add_to_page_cache()?

I also have a feeling that we've been trying to work around something else, but
I can't tell yet as I'll probably need to read a bit more/better on how hugetlb
does the accounting and also on reservations.

Thanks,

--
Peter Xu