On 06.05.25 07:00, Dev Jain wrote:
To use PTE batching, we want to determine whether the folio mapped by
the PTE is large, thus requiring the use of vm_normal_folio(). We want
to avoid the cost of vm_normal_folio() if the code path doesn't already
require the folio. For arm64, pte_batch_hint() does the job. To generalize
this hint, add a helper which will determine whether two consecutive PTEs
point to consecutive PFNs, in which case there is a high probability that
the underlying folio is large.
Signed-off-by: Dev Jain <dev.jain@xxxxxxx>
---
include/linux/pgtable.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index b50447ef1c92..28e21fcc7837 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -369,6 +369,22 @@ static inline pgd_t pgdp_get(pgd_t *pgdp)
}
#endif
+/* Caller must ensure that ptep + 1 exists */
+static inline bool maybe_contiguous_pte_pfns(pte_t *ptep, pte_t pte)
+{
+ pte_t *next_ptep, next_pte;
+
+ if (pte_batch_hint(ptep, pte) != 1)
+ return true;
+
+ next_ptep = ptep + 1;
+ next_pte = ptep_get(next_ptep);
+ if (!pte_present(next_pte))
+ return false;
+
+ return unlikely(pte_pfn(next_pte) - pte_pfn(pte) == PAGE_SIZE);
+}
So, where we want to use that is:
if (pte_present(old_pte)) {
if ((max_nr != 1) && maybe_contiguous_pte_pfns(old_ptep, old_pte)) {
struct folio *folio = vm_normal_folio(vma, old_addr, old_pte);
if (folio && folio_test_large(folio))
nr = folio_pte_batch(folio, old_addr, old_ptep,
old_pte, max_nr, fpb_flags, NULL, NULL, NULL);
}
}
where we won't need the folio later. But want it all part of the same folio?
And the simpler version would be
if (pte_present(old_pte)) {
if (max_nr != 1) {
struct folio *folio = vm_normal_folio(vma, old_addr, old_pte);
if (folio && folio_test_large(folio))
nr = folio_pte_batch(folio, old_addr, old_ptep,
old_pte, max_nr, fpb_flags, NULL, NULL, NULL);
}
}
Two things come to mind:
(1) Do we *really* care about the vm_normal_folio() + folio_test_large() call that much, that you
have to add this optimization ahead of times ? :)
I am not sure I get you here. folio_pte_batch() seems to be the simplest thing we can do as being done around in the code elsewhere, I am not aware of any alternate.
(2) Do we really need "must be part of the same folio", or could be just batch over present
ptes that map consecutive PFNs? In that case, a helper that avoids folio_pte_batch() completely
might be better.