Re: [PATCH RFC 01/14] mm/memory: drop highest_memmap_pfn sanity check in vm_normal_page()
From: Lance Yang
Date: Thu Jul 03 2025 - 08:35:31 EST
On Mon, Jun 23, 2025 at 10:04 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 20.06.25 14:50, Oscar Salvador wrote:
> > On Tue, Jun 17, 2025 at 05:43:32PM +0200, David Hildenbrand wrote:
> >> In 2009, we converted a VM_BUG_ON(!pfn_valid(pfn)) to the current
> >> highest_memmap_pfn sanity check in commit 22b31eec63e5 ("badpage:
> >> vm_normal_page use print_bad_pte"), because highest_memmap_pfn was
> >> readily available.
> >>
> >> Nowadays, this is the last remaining highest_memmap_pfn user, and this
> >> sanity check is not really triggering ... frequently.
> >>
> >> Let's convert it to VM_WARN_ON_ONCE(!pfn_valid(pfn)), so we can
> >> simplify and get rid of highest_memmap_pfn. Checking for
> >> pfn_to_online_page() might be even better, but it would not handle
> >> ZONE_DEVICE properly.
> >>
> >> Do the same in vm_normal_page_pmd(), where we don't even report a
> >> problem at all ...
> >>
> >> What might be better in the future is having a runtime option like
> >> page-table-check to enable such checks dynamically on-demand. Something
> >> for the future.
> >>
> >> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
> >
>
> Hi Oscar,
>
> > I'm confused, I'm missing something here.
> > Before this change we would return NULL if e.g: pfn > highest_memmap_pfn, but
> > now we just print the warning and call pfn_to_page() anyway.
> > AFAIK, pfn_to_page() doesn't return NULL?
>
> You're missing that vm_normal_page_pmd() was created as a copy from
> vm_normal_page() [history of the sanity check above], but as we don't
> have (and shouldn't have ...) print_bad_pmd(), we made the code look
> like this would be something that can just happen.
>
> "
> Do the same in vm_normal_page_pmd(), where we don't even report a
> problem at all ...
> "
>
> So we made something that should never happen a runtime sanity check
> without ever reporting a problem ...
IIUC, the reasoning is that because this case should never happen, we can
change the behavior from returning NULL to a "warn and continue" model?
Thanks,
Lance