Re: [RFC] mm: khugepaged: use largest enabled hugepage order for min_free_kbytes

From: Usama Arif
Date: Tue Jun 10 2025 - 11:22:46 EST




On 10/06/2025 15:20, Zi Yan wrote:
> On 10 Jun 2025, at 10:03, Lorenzo Stoakes wrote:
>
>> On Mon, Jun 09, 2025 at 03:49:52PM -0400, Zi Yan wrote:
>> [snip]
>>>> I really think a hard cap, expressed in KB/MB, on pageblock size is the way to
>>>> go (but overrideable for people crazy enough to truly want 512 MB pages - and
>>>> who cannot then complain about watermarks).
>>>
>>> I agree. Basically, I am thinking:
>>> 1) use something like 2MB as default pageblock size for all arch (the value can
>>> be set differently if some arch wants a different pageblock size due to other
>>> reasons), this can be done by modifying PAGE_BLOCK_MAX_ORDER’s default
>>> value;
>>
>> I don't think we can set this using CONFIG_PAGE_BLOCK_MAX_ORDER.
>>
>> Because the 'order' will be a different size depending on page size obviously.
>>
>> So I'm not sure how this would achieve what we want?
>>
>> It seems to me we should have CONFIG_PAGE_BLOCK_MAX_SIZE_MB or something like
>> this, and we take min(page_size << CONFIG_PAGE_BLOCK_MAX_ORDER,
>> CONFIG_PAGE_BLOCK_MAX_SIZE_MB << 20) as the size.
>
> OK. Now I get what you mean. Yeah, using MB is clearer, as the user does not
> need to know the page size to set the right pageblock size.
>

Just adding this here for completeness: we could probably do something like the
diff below, or key it on PAGE_SIZE_64KB instead of ARM64_64K_PAGES.
It would be messy, though, as you would then need to do it for every arch and
every page size of that arch.


diff --git a/mm/Kconfig b/mm/Kconfig
index 99910bc649f6..ae83e31ea412 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1023,6 +1023,7 @@ config PAGE_BLOCK_MAX_ORDER
 	default 10 if ARCH_FORCE_MAX_ORDER = 0
 	range 1 ARCH_FORCE_MAX_ORDER if ARCH_FORCE_MAX_ORDER != 0
 	default ARCH_FORCE_MAX_ORDER if ARCH_FORCE_MAX_ORDER != 0
+	default 5 if ARM64_64K_PAGES
 	help
 	  The page block order refers to the power of two number of pages that
 	  are physically contiguous and can have a migrate type associated to
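
For comparison, the size-based cap Lorenzo describes above, i.e.
min(page_size << CONFIG_PAGE_BLOCK_MAX_ORDER, CONFIG_PAGE_BLOCK_MAX_SIZE_MB << 20),
could be sketched roughly as below. This is only an illustration in plain
userspace C, not kernel code: the helper names pageblock_bytes() and
pageblock_order_for() are made up for this sketch, and the page sizes and
orders passed in are assumed example values, not anything read from a real
config.

```c
#include <stdio.h>

/*
 * Hypothetical helper: pageblock size in bytes, taken as the smaller of
 * the order-based size and the MB-based cap, as proposed in the thread.
 */
static unsigned long long pageblock_bytes(unsigned long long page_size,
					  unsigned int max_order,
					  unsigned long long max_size_mb)
{
	unsigned long long by_order = page_size << max_order;
	unsigned long long by_size = max_size_mb << 20;

	return by_order < by_size ? by_order : by_size;
}

/*
 * Hypothetical helper: largest order such that (page_size << order)
 * still fits within the given number of bytes. Returns 0 if even two
 * pages do not fit.
 */
static unsigned int pageblock_order_for(unsigned long long page_size,
					unsigned long long bytes)
{
	unsigned int order = 0;

	while ((page_size << (order + 1)) <= bytes)
		order++;

	return order;
}
```

Under those assumptions, a 2 MB cap gives order 9 with 4K pages (max order 10)
and order 5 with 64K pages (max order 13), so the 64K case lands on the same
value as the "default 5" line in the diff above without needing a per-arch,
per-page-size entry.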
>>
>>>
>>> 2) make pageblock_order a boot time parameter, so that a user who wants
>>> 512MB pages can still get them by changing pageblock order at boot time.
>>>
>>
>> Again, I don't think order is the right choice here, though having it boot time
>> configurable (perhaps overriding the default config there) seems sensible.
>
> Understood. The new pageblock size should be set using MB.
>
>>
>>> WDYT?
>>
>>>
>>>>
>>>>>
>>>>> Often, users just ask for an impossible combination: they
>>>>> want to use all free memory, because they paid for it, and they
>>>>> want THPs, because they want max performance. When PMD THP is
>>>>> small like 2MB, the “unusable” free memory is not that noticeable,
>>>>> but when PMD THP is as large as 512MB, user just cannot unsee it. :)
>>>>
>>>> Well, users asking for crazy things then being surprised when they get them
>>>> is nothing new :P
>>>>
>>>>>
>>>>>
>>>>> Best Regards,
>>>>> Yan, Zi
>>>>
>>>> Thanks for your input!
>>>>
>>>> Cheers, Lorenzo
>>>
>>>
>>> Best Regards,
>>> Yan, Zi
>
>
> Best Regards,
> Yan, Zi