Re: [PATCH v2] mm, hugepages: add mremap() support for hugepage backed vma

From: Vlastimil Babka
Date: Tue Aug 24 2021 - 11:33:13 EST


On 8/19/21 01:35, Mina Almasry wrote:
> On Fri, Aug 13, 2021 at 4:40 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>>
>> Adding Michal, Vlastimil and Kirill as adding this type of support
>> was discussed in this thread:
>> https://lore.kernel.org/linux-mm/3ba05809-63a2-2969-e54f-fd0202fe336b@xxxxxx/
>>
>
> Ack. I haven't gone through that thread but if there are any concerns
> with what I whipped up here, do let me know.

Looks like there were mostly concerns with vma's growing and you're not handling
that, so that's fine?

>> On 8/10/21 6:10 PM, Mina Almasry wrote:
>> > Support mremap() for hugepage backed vma segment by simply repositioning
>> > page table entries. The page table entries are repositioned to the new
>> > virtual address on mremap().
>> >
>> > Hugetlb mremap() support is of course generic; my motivating use case
>> > is a library (hugepage_text), which reloads the ELF text of executables
>> > in hugepages. This significantly increases the execution performance of
>> > said executables.

I think this scenario has been motivation for FS support of THP, but AFAIK we're
not there yet. Also IIRC Google had some library that did what you describe, but
using THP's instead of hugetlbfs - might be simpler as there are no reservations
involved.

But nothing against improving mremap() support wrt hugetlbfs.

> I'll yield to whatever you decide here because I reckon you have much
> more experience and better judgement here. But my thoughts:
>
> 'Sane' usage of mremap() is something like:
> 1. mmap() a hugetlbfs vma.
> 2. Pass the vma received from step (1) to mremap() to remap it to a
> different location.
>
> I don't know if there is another usage pattern I need to worry about
> but given the above, old_addr and old_len will be hugepage aligned
> already since they are values returned by the previous mmap() call
> which aligns them, no? So, I think aligning old_addr and old_len to
> the hugepage boundary is fine.
>
> With this support we don't allow mremap() expansion. In my use case
> old_len==new_len acutally. I think it's fine to also align new_len to
> the hugepage boundary
>
> I already have this code that errors out if the lengths are not aligned:
>
> if (old_len & ~huge_page_mask(h) || new_len & ~huge_page_mask(h))
> goto out;
>
> I think aligning new_addr breaks my use case though. In my use case
> new_addr is the start of the text segment in the ELF executable, and I
> don't think that's guaranteed to be anything but page aligned.

Hm, I have a vague (possibly wrong) recollection that Andrea mentioned he always
planned text to be in THPs so he made sure ELF text sections are aligned as
such. I guess there's a way to declare alignment in ELF and it depends on
whether your distro's linker is set up to ask for hugepage-sized one?

> Aligning new_addr seems like it would break my use case. If you insist
> though I'm happy aligning new_addr in the upstream kernel and not
> doing that in our kernel, but if I'm not particularly happy with the
> hugepage alignment I'd say it is likely future users of hugetlb
> mremap() also won't like the hugepage alignement, but I yield to you
> here.