[RFC PATCH 2/3] mm, hugetlb: Update comments in hugetlb_fault
From: Oscar Salvador
Date: Mon Jun 02 2025 - 10:32:42 EST
There are some comments in the hugetlb faulting path that are not
holding anymore because the code has changed.
Most promiment is this one:
/*
* vmf.orig_pte could be a migration/hwpoison vmf.orig_pte at this
* point, so this check prevents the kernel from going below assuming
* that we have an active hugepage in pagecache. This goto expects
* the 2nd page fault, and is_hugetlb_entry_(migration|hwpoisoned)
* check will properly handle it.
*/
This was written because back in the day we used to do:
hugetlb_fault
ptep = huge_pte_offset(...)
if (ptep) {
entry = huge_ptep_get(ptep)
if (unlikely(is_hugetlb_entry_migration(entry))
...
else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
...
...
...
/*
* entry could be a migration/hwpoison entry at this point, so this
* check prevents the kernel from going below assuming that we have
* a active hugepage in pagecache. This goto expects the 2nd page fault,
* and is_hugetlb_entry_(migration|hwpoisoned) check will properly
* handle it.
*/
if (!pte_present(entry))
goto out_mutex;
...
The code was designed to check for hwpoisoned/migration entries upfront,
and then bail out if further down the pte was not present anymore,
relying on taking the second fault so the code from the beginning would
handle it this time for real.
This has changed, so this comment does not hold.
Also, mention in hugetlb_fault() that besides allocation and instantiation,
the mutex also serializes against truncation.
Signed-off-by: Oscar Salvador <osalvador@xxxxxxx>
---
mm/hugetlb.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f7bef660ef94..6ef90958839f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6715,9 +6715,8 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
};
/*
- * Serialize hugepage allocation and instantiation, so that we don't
- * get spurious allocation failures if two CPUs race to instantiate
- * the same page in the page cache.
+ * hugetlb_fault_mutex_hash serializes allocation, instantiation and
+ * truncation.
*/
mapping = vma->vm_file->f_mapping;
hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff);
@@ -6765,11 +6764,8 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
ret = 0;
/*
- * vmf.orig_pte could be a migration/hwpoison vmf.orig_pte at this
- * point, so this check prevents the kernel from going below assuming
- * that we have an active hugepage in pagecache. This goto expects
- * the 2nd page fault, and is_hugetlb_entry_(migration|hwpoisoned)
- * check will properly handle it.
+ * If it is not present, it means we are dealing either with a migration
+ * or hwpoisoned entry.
*/
if (!pte_present(vmf.orig_pte)) {
if (unlikely(is_hugetlb_entry_migration(vmf.orig_pte))) {
@@ -6793,8 +6789,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
* If we are going to COW/unshare the mapping later, we examine the
* pending reservations for this page now. This will ensure that any
* allocations necessary to record that reservation occur outside the
- * spinlock. Also lookup the pagecache page now as it is used to
- * determine if a reservation has been consumed.
+ * spinlock.
*/
if ((flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) &&
!(vma->vm_flags & VM_MAYSHARE) && !huge_pte_write(vmf.orig_pte)) {
--
2.49.0