Re: [PATCH] mm/huge_memory: fix dereferencing invalid pmd migration entry
From: Hugh Dickins
Date: Thu Apr 17 2025 - 01:30:15 EST
On Tue, 15 Apr 2025, Zi Yan wrote:
>
> Anyway, we need to figure out why both THP migration and deferred_split_scan()
> hold the THP lock first, which sounds impossible to me. Or some other execution
> interleaving is happening.

I think perhaps you're missing that an anon_vma lookup points to a
location which may contain the folio of interest, but might instead
contain another folio: and weeding out those other folios is precisely
what the "folio != pmd_folio(*pmd)" check (and the "risk of replacing
the wrong folio" comment a few lines above it) is for.

The "BUG: unable to handle page fault" comes about because that other
folio might actually be being migrated at this time, so we encounter
a PMD migration entry instead of a valid PMD entry. But if it's the
folio we're looking for, our folio lock excludes a racing migration,
so it would never be a PMD migration entry for our folio.
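
To make the ordering concrete, here is a rough userspace toy model of that
reasoning. It is not kernel code and not the patch itself: every toy_* name
is hypothetical, and it only models the case where the caller passes in the
folio it holds locked. The point it tries to show is that the migration
check has to come before anything like pmd_folio() is applied to the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: all toy_* names are hypothetical. */
struct toy_folio {
	int id;
};

enum toy_pmd_kind {
	TOY_PMD_NONE,		/* nothing mapped at this location */
	TOY_PMD_PRESENT_THP,	/* valid huge PMD mapping some folio */
	TOY_PMD_MIGRATION,	/* PMD migration entry, no valid mapping */
};

struct toy_pmd {
	enum toy_pmd_kind kind;
	struct toy_folio *folio;	/* meaningful only for TOY_PMD_PRESENT_THP */
};

/* Stand-in for pmd_folio(): only legal on a present huge PMD. */
static struct toy_folio *toy_pmd_folio(const struct toy_pmd *pmd)
{
	assert(pmd->kind == TOY_PMD_PRESENT_THP);
	return pmd->folio;
}

/*
 * Decide whether to split the PMD on behalf of "locked", the folio the
 * caller holds locked (assumed non-NULL in this model).
 */
static bool toy_should_split(const struct toy_pmd *pmd,
			     const struct toy_folio *locked)
{
	if (pmd->kind == TOY_PMD_NONE)
		return false;

	/*
	 * Our folio lock excludes a racing migration of our own folio, so
	 * a migration entry here must belong to some other folio: skip it
	 * without ever looking up a folio from the entry.
	 */
	if (pmd->kind == TOY_PMD_MIGRATION)
		return false;

	/* Present huge PMD: weed out other folios at the same location. */
	return toy_pmd_folio(pmd) == locked;
}
```

With one toy_folio standing for the locked folio and another for an
unrelated one, toy_should_split() accepts only a present PMD that maps the
locked folio. It declines both a present PMD for the other folio and a
migration entry, and never reaches toy_pmd_folio() in the migration case.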
Hugh