Re: [PATCH RFC 01/14] mm/memory: drop highest_memmap_pfn sanity check in vm_normal_page()

From: David Hildenbrand
Date: Thu Jul 03 2025 - 08:40:12 EST


On 03.07.25 14:34, Lance Yang wrote:
On Mon, Jun 23, 2025 at 10:04 PM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 20.06.25 14:50, Oscar Salvador wrote:
On Tue, Jun 17, 2025 at 05:43:32PM +0200, David Hildenbrand wrote:
In 2009, we converted a VM_BUG_ON(!pfn_valid(pfn)) to the current
highest_memmap_pfn sanity check in commit 22b31eec63e5 ("badpage:
vm_normal_page use print_bad_pte"), because highest_memmap_pfn was
readily available.

Nowadays, this is the last remaining highest_memmap_pfn user, and this
sanity check is not really triggering ... frequently.

Let's convert it to VM_WARN_ON_ONCE(!pfn_valid(pfn)), so we can
simplify and get rid of highest_memmap_pfn. Checking for
pfn_to_online_page() might be even better, but it would not handle
ZONE_DEVICE properly.

Do the same in vm_normal_page_pmd(), where we don't even report a
problem at all ...

What might be better in the future is having a runtime option like
page-table-check to enable such checks dynamically on-demand. Something
for the future.

Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>


Hi Oscar,

I'm confused, I'm missing something here.
Before this change we would return NULL if e.g: pfn > highest_memmap_pfn, but
now we just print the warning and call pfn_to_page() anyway.
AFAIK, pfn_to_page() doesn't return NULL?

You're missing that vm_normal_page_pmd() was created as a copy from
vm_normal_page() [history of the sanity check above], but as we don't
have (and shouldn't have ...) print_bad_pmd(), we made the code look
like this would be something that can just happen.

"
Do the same in vm_normal_page_pmd(), where we don't even report a
problem at all ...
"

So we made something that should never happen a runtime sanity check
without ever reporting a problem ...

IIUC, the reasoning is that because this case should never happen, we can
change the behavior from returning NULL to a "warn and continue" model?

Well, yes. Point is, that check should have never been copy-pasted that way, while dropping the actual warning :)

It's a sanity check for something that should never happen, turned into something that looks like it must be handled and would be valid to encounter.

--
Cheers,

David / dhildenb