Re: [PATCH v1] mm, hugepages: add mremap() support for hugepage backed vma

From: Mike Kravetz
Date: Thu Aug 05 2021 - 12:56:33 EST


On 8/4/21 11:03 AM, Mina Almasry wrote:
> On Mon, Aug 2, 2021 at 4:51 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>> On 7/30/21 3:15 PM, Mina Almasry wrote:
>>> From: Ken Chen <kenchen@xxxxxxxxxx>
>>> +static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
>>> + unsigned long new_addr, pte_t *src_pte)
>>> +{
>>> + struct address_space *mapping = vma->vm_file->f_mapping;
>>> + struct hstate *h = hstate_vma(vma);
>>> + struct mm_struct *mm = vma->vm_mm;
>>> + pte_t *dst_pte, pte;
>>> + spinlock_t *src_ptl, *dst_ptl;
>>> +
>>> + /* Shared pagetables need more thought here if we re-enable them */
>>> + BUG_ON(vma_shareable(vma, old_addr));
>>
>> I agree that shared page tables will complicate the code. Where do you
>> actually prevent mremap on mappings which can share page tables? I
>> don't see anything before this BUG.
>>
>
> Sorry, I added a check in mremap to return early if
> hugetlb_vma_sharable() in v2.
>

After thinking about this a bit, I am not sure if this is a good idea.
My assumption is that you will make mremap will return an error if
vma_shareable(). We will then need to document that behavior in the
mremap man page. I 'think' that will require documenting hugetlb pmd
sharing which is not documented anywhere today.

Another option is to 'unshare' early in mremap. However, unshare will
have the same effect as throwing away all the page table entries for the
shared area. So, copying page table entries may be very fast. And, the
first fault on the new vma would theoretically establish sharing again
(assuming all conditions are met). Otherwise, the new vma will not be
populated until pages are faulted in. I know mremap wants to preserve
page tables when it remaps. Does this move us too far from that design
goal?

The last option would be to fully support pmd sharing in the page table
copying code. It is a bit of a pain, but already accounted for in
routines like copy_hugetlb_page_range.

Just some things to consider. I would prefer unsharing or fully
supporting sharing rather than return an error.
--
Mike Kravetz