Re: [PATCH] Revert "mm, hugetlb: remove hugepages_treat_as_movable sysctl"
From: David Hildenbrand
Date: Wed Oct 08 2025 - 04:58:41 EST
On 07.10.25 23:44, Gregory Price wrote:
This reverts commit d6cb41cc44c63492702281b1d329955ca767d399.
This sysctl provides some flexibility between multiple requirements which
are difficult to square without adding significantly more complexity.
1) onlining memory in ZONE_MOVABLE to maintain hotplug compatibility
2) onlining memory in ZONE_MOVABLE to prevent GFP_KERNEL usage
3) passing NUMA structure through to a virtual machine (node0=vnode0,
node1=vnode1) so a guest can make good placement decisions.
4) utilizing 1GB hugepages for VM host memory to reduce TLB pressure
5) Managing device memory after init-time to avoid incidental usage
at boot (due to being placed in ZONE_NORMAL), or to provide users
configuration flexibility.
When device-hotplugged memory does not require hot-unplug assurances,
there is no reason to avoid allowing otherwise non-migratable hugepages
in this zone. This allows for allocation of 1GB gigantic pages for VMs
with existing mechanisms.
Boot-time CMA is not possible for driver-managed hotplug memory, as CMA
requires the memory to be registered as SystemRAM at boot time.
Updated the code to land in appropriate locations since it all moved.
Updated the documentation to add more context when this is useful.
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Alexandru Moise <00moses.alexander00@xxxxxxxxx>
Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Suggested-by: David Rientjes <rientjes@xxxxxxxxxx>
Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>
Link: https://lore.kernel.org/all/20180201193132.Hk7vI_xaU%25akpm@xxxxxxxxxxxxxxxxxxxx/
---
Documentation/admin-guide/sysctl/vm.rst | 31 +++++++++++++++++++++++++
include/linux/hugetlb.h | 4 +++-
mm/hugetlb.c | 9 +++++++
3 files changed, 43 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 4d71211fdad8..c9f26cd447d7 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -40,6 +40,7 @@ Currently, these files are in /proc/sys/vm:
- enable_soft_offline
- extfrag_threshold
- highmem_is_dirtyable
+- hugepages_treat_as_movable
- hugetlb_shm_group
- laptop_mode
- legacy_va_layout
@@ -356,6 +357,36 @@ only use the low memory and they can fill it up with dirty data without
any throttling.
+hugepages_treat_as_movable
+==========================
+
+This parameter controls whether otherwise immovable hugepages (e.g. 1GB
+gigantic pages) may be allocated from ZONE_MOVABLE. If set to non-zero,
+gigantic hugepages can be allocated from ZONE_MOVABLE. ZONE_MOVABLE memory
+may be created via the kernel boot parameter `kernelcore` or via memory
+hotplug as discussed in Documentation/admin-guide/mm/memory-hotplug.rst.
+
+Support may depend on specific architecture and/or the hugepage size. If
+a hugepage supports migration, allocation from ZONE_MOVABLE is always
+enabled (for example 2MB on x86) for the hugepage regardless of the value
+of this parameter. IOW, this parameter affects only non-migratable hugepages.
+
+Assuming that hugepages are not migratable on your system, one use case of
+this parameter is to make the hugepage pool more extensible by enabling
+allocation from ZONE_MOVABLE. Page reclaim, migration, and compaction are
+more effective in ZONE_MOVABLE, so contiguous memory is more likely to be
+available there. Note that using ZONE_MOVABLE for non-migratable hugepages
+can harm other features such as memory hotremove (which expects that memory
+blocks in ZONE_MOVABLE are always removable), so this is a trade-off that
+users are responsible for.
+
+One common use-case of this feature is to allocate 1GB gigantic pages for
+virtual machines from otherwise not-hotplugged memory which has been
+isolated from kernel allocations by being onlined into ZONE_MOVABLE.
+These pages tend to be allocated and released more explicitly, and so
+hotplug can still be achieved with appropriate orchestration.
+
+
hugetlb_shm_group
=================
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 526d27e88b3b..bbaa1b4908b6 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -172,6 +172,7 @@ bool hugetlbfs_pagecache_present(struct hstate *h,
struct address_space *hugetlb_folio_mapping_lock_write(struct folio *folio);
+extern int hugepages_treat_as_movable;
extern int sysctl_hugetlb_shm_group;
extern struct list_head huge_boot_pages[MAX_NUMNODES];
@@ -926,7 +927,8 @@ static inline gfp_t htlb_alloc_mask(struct hstate *h)
{
gfp_t gfp = __GFP_COMP | __GFP_NOWARN;
- gfp |= hugepage_movable_supported(h) ? GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
+ gfp |= (hugepage_movable_supported(h) || hugepages_treat_as_movable) ?
+ GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
I mean, this is as ugly as it gets.
Can't we just let that old approach RIP where it belongs? :)
If something is unmovable, it does not belong on ZONE_MOVABLE, as simple as that.
Something I could sympathize with is treating gigantic pages that are actually
migratable as movable.
Like
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 526d27e88b3b2..78da85b1308dd 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -896,37 +896,12 @@ static inline bool hugepage_migration_supported(struct hstate *h)
return arch_hugetlb_migration_supported(h);
}
-/*
- * Movability check is different as compared to migration check.
- * It determines whether or not a huge page should be placed on
- * movable zone or not. Movability of any huge page should be
- * required only if huge page size is supported for migration.
- * There won't be any reason for the huge page to be movable if
- * it is not migratable to start with. Also the size of the huge
- * page should be large enough to be placed under a movable zone
- * and still feasible enough to be migratable. Just the presence
- * in movable zone does not make the migration feasible.
- *
- * So even though large huge page sizes like the gigantic ones
- * are migratable they should not be movable because its not
- * feasible to migrate them from movable zone.
- */
-static inline bool hugepage_movable_supported(struct hstate *h)
-{
- if (!hugepage_migration_supported(h))
- return false;
-
- if (hstate_is_gigantic(h))
- return false;
- return true;
-}
-
/* Movability of hugepages depends on migration support. */
static inline gfp_t htlb_alloc_mask(struct hstate *h)
{
gfp_t gfp = __GFP_COMP | __GFP_NOWARN;
- gfp |= hugepage_movable_supported(h) ? GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
+ gfp |= hugepage_migration_supported(h) ? GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
return gfp;
}
Assume you want to offline part of ZONE_MOVABLE: there might still be sufficient
space to allocate a 1 GiB area elsewhere and actually move the gigantic page.
IIRC, we do the same for memory offlining already.
Now, maybe we want to make that configurable. But then, I would much rather tweak the
hstate_is_gigantic() check in hugepage_movable_supported(). And the parameter
would need a much better name than some "treat as movable".
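
A rough userspace sketch of gating the gigantic check behind a knob might look
like the following. It is untested against a real tree and only models the
decision logic: struct hstate is reduced to two made-up fields,
MODEL_MAX_PAGE_ORDER stands in for MAX_PAGE_ORDER with an assumed value of 10,
and movable_gigantic_pages is a placeholder, not a proposed parameter name.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct hstate (assumption). */
struct hstate {
	unsigned int order;	/* log2 of base pages per huge page */
	bool arch_migratable;	/* models arch_hugetlb_migration_supported() */
};

/* Assumed buddy allocator limit; the real value is config dependent. */
#define MODEL_MAX_PAGE_ORDER 10

/* Hypothetical knob; the name is a placeholder only. */
static bool movable_gigantic_pages;

static inline bool hstate_is_gigantic(const struct hstate *h)
{
	return h->order > MODEL_MAX_PAGE_ORDER;
}

static inline bool hugepage_migration_supported(const struct hstate *h)
{
	return h->arch_migratable;
}

/*
 * Same shape as the existing helper, except that the gigantic
 * check can be relaxed by the knob. Non-migratable hstates are
 * never treated as movable, regardless of the knob.
 */
static inline bool hugepage_movable_supported(const struct hstate *h)
{
	if (!hugepage_migration_supported(h))
		return false;

	if (hstate_is_gigantic(h) && !movable_gigantic_pages)
		return false;
	return true;
}
```

With the knob off, this matches the current hugepage_movable_supported(). With
it on, only gigantic hstates that support migration would get
GFP_HIGHUSER_MOVABLE through htlb_alloc_mask(); unmovable ones still stay out
of ZONE_MOVABLE.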
--
Cheers
David / dhildenb