Re: [PATCH v2 1/1] mm: fix folio_pte_batch() on XEN PV

From: Andrew Morton
Date: Sat May 03 2025 - 21:29:10 EST


On Fri, 2 May 2025 23:50:19 +0200 Petr Vaněk <arkamar@xxxxxxxx> wrote:

> On XEN PV, folio_pte_batch() can incorrectly batch beyond the end of a
> folio due to a corner case in pte_advance_pfn(). Specifically, when the
> PFN following the folio maps to an invalidated MFN,
>
> expected_pte = pte_advance_pfn(expected_pte, nr);
>
> produces a pte_none(). If the actual next PTE in memory is also
> pte_none(), the pte_same() succeeds,
>
> if (!pte_same(pte, expected_pte))
> break;
>
> the loop is not broken, and batching continues into unrelated memory.
>
> ...

Looks OK for now I guess but it looks like we should pay some attention
to what types we're using.

> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -248,11 +248,9 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
> pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags,
> bool *any_writable, bool *any_young, bool *any_dirty)
> {
> - unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio);
> - const pte_t *end_ptep = start_ptep + max_nr;
> pte_t expected_pte, *ptep;
> bool writable, young, dirty;
> - int nr;
> + int nr, cur_nr;
>
> if (any_writable)
> *any_writable = false;
> @@ -265,11 +263,15 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
> VM_WARN_ON_FOLIO(!folio_test_large(folio) || max_nr < 1, folio);
> VM_WARN_ON_FOLIO(page_folio(pfn_to_page(pte_pfn(pte))) != folio, folio);
>
> + /* Limit max_nr to the actual remaining PFNs in the folio we could batch. */
> + max_nr = min_t(unsigned long, max_nr,
> + folio_pfn(folio) + folio_nr_pages(folio) - pte_pfn(pte));
> +

Methinks max_nr really wants to be unsigned long. That will permit the
cleanup of quite a bit of truncation, extension, signedness conversion
and general type chaos in folio_pte_batch()'s various callers.

And...

Why does folio_nr_pages() return a signed quantity? It's a count.

And why the heck is folio_pte_batch() inlined? It's larger then my
first hard disk and it has five callsites!