Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

From: Zi Yan
Date: Fri Nov 09 2018 - 10:34:14 EST


On 9 Nov 2018, at 8:11, Mel Gorman wrote:

> On Fri, Nov 09, 2018 at 03:13:18PM +0300, Kirill A. Shutemov wrote:
>> On Thu, Nov 08, 2018 at 10:48:58PM -0800, Anthony Yznaga wrote:
>>> The basic idea as outlined by Mel Gorman in [2] is:
>>>
>>> 1) On first fault in a sufficiently sized range, allocate a huge page
>>> sized and aligned block of base pages. Map the base page
>>> corresponding to the fault address and hold the rest of the pages in
>>> reserve.
>>> 2) On subsequent faults in the range, map the pages from the reservation.
>>> 3) When enough pages have been mapped, promote the mapped pages and
>>> remaining pages in the reservation to a huge page.
>>> 4) When there is memory pressure, release the unused pages from their
>>> reservations.
>>
>> I haven't yet read the patch in details, but I'm skeptical about the
>> approach in general for few reasons:
>>
>> - PTE page table retracting to replace it with huge PMD entry requires
>> down_write(mmap_sem). It makes the approach not practical for many
>> multi-threaded workloads.
>>
>> I don't see a way to avoid exclusive lock here. I will be glad to
>> be proved otherwise.
>>
>
> That problem is somewhat fundamental to the mmap_sem itself and
> conceivably it could be alleviated by range-locking (if that gets
> completed). The other thing to bear in mind is the timing. If the
> promotion is in-place due to reservations, there isn't the allocation
> overhead and the hold times *should* be short.
>

Is it possible to convert all these PTEs to migration entries during
the promotion and replace them with a huge PMD entry afterwards?
AFAIK, migrating pages does not require holding mmap_sem.
Basically, it would act like migrating 512 base pages to a THP without
actually doing the page copy.
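
To make the idea a bit more concrete, here is a userspace toy model of
what I have in mind. It is only a sketch, not kernel code: all names
(toy_pte, toy_pmd, toy_fault, toy_mapped, toy_promote) and the
promotion threshold are made up for illustration, and it ignores
locking, TLB flushing and rmap entirely. It assumes 512 base pages per
huge page, as on x86-64 with 4KB pages and 2MB THPs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTES_PER_PMD 512	/* assumption: 2MB huge page, 4KB base pages */

/* Hypothetical PTE states; PTE_NONE is 0 so zero-init means "unmapped". */
enum pte_state { PTE_NONE, PTE_PRESENT, PTE_MIGRATION };

struct toy_pte {
	enum pte_state state;
	uint64_t pfn;
};

struct toy_pmd {
	bool huge;		/* true once the huge mapping is installed */
	uint64_t huge_pfn;	/* first pfn of the huge page */
	struct toy_pte pte[PTES_PER_PMD];
};

/* Fault path: map the base page at index idx from the reservation. */
static void toy_fault(struct toy_pmd *pmd, uint64_t resv_base_pfn, size_t idx)
{
	assert(pmd != NULL);
	if (pmd->huge || idx >= PTES_PER_PMD)
		return;
	pmd->pte[idx].state = PTE_PRESENT;
	pmd->pte[idx].pfn = resv_base_pfn + idx;
}

/* Count base pages currently mapped through PTEs. */
static size_t toy_mapped(const struct toy_pmd *pmd)
{
	size_t i, n = 0;

	for (i = 0; i < PTES_PER_PMD; i++)
		if (pmd->pte[i].state == PTE_PRESENT)
			n++;
	return n;
}

/*
 * Promotion without copying: every mapped page already sits at its
 * final offset inside the reserved, aligned block.
 */
static bool toy_promote(struct toy_pmd *pmd, uint64_t resv_base_pfn,
			size_t threshold)
{
	size_t i;

	if (pmd->huge || resv_base_pfn % PTES_PER_PMD != 0)
		return false;
	if (toy_mapped(pmd) < threshold)
		return false;

	/* Only promote in place: each PTE must point into the reservation. */
	for (i = 0; i < PTES_PER_PMD; i++)
		if (pmd->pte[i].state == PTE_PRESENT &&
		    pmd->pte[i].pfn != resv_base_pfn + i)
			return false;

	/* Phase 1: replace the present PTEs with migration entries. */
	for (i = 0; i < PTES_PER_PMD; i++)
		if (pmd->pte[i].state == PTE_PRESENT)
			pmd->pte[i].state = PTE_MIGRATION;

	/* Phase 2: install the huge PMD entry and drop the PTEs. */
	pmd->huge = true;
	pmd->huge_pfn = resv_base_pfn;
	for (i = 0; i < PTES_PER_PMD; i++) {
		pmd->pte[i].state = PTE_NONE;
		pmd->pte[i].pfn = 0;
	}
	return true;
}
```

In the real kernel a concurrent fault that hits a migration entry waits
for the migration to finish, which is what I am hoping would cover the
window between phase 1 and phase 2. Whether the page table retraction
itself can then be done without down_write(mmap_sem) is exactly the
open question.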

--
Best Regards
Yan Zi
