Re: [PATCH RESEND] mm/pagewalk: split walk_page_range_novma() into kernel/user parts

From: David Hildenbrand
Date: Wed Jun 04 2025 - 04:12:59 EST


On 04.06.25 10:07, Mike Rapoport wrote:
> On Wed, Jun 04, 2025 at 09:39:30AM +0200, David Hildenbrand wrote:
>> On 03.06.25 21:22, Lorenzo Stoakes wrote:
>>> The walk_page_range_novma() function is rather confusing - it supports two
>>> modes, one used often, the other used only for debugging.
>>>
>>> The first mode is the common case of traversal of kernel page tables, which
>>> is what nearly all callers use this for.
>>
>> ... and what people should be using it for 🙂
>>
>>> Secondly it provides an unusual debugging interface that allows for the
>>> traversal of page tables in a userland range of memory even for that memory
>>> which is not described by a VMA.
>>>
>>> This is highly unusual and it is far from certain that such page tables
>>> should even exist, but perhaps this is precisely why it is useful as a
>>> debugging mechanism.
>>>
>>> As a result, this is utilised by ptdump only. Historically, things were
>>> reversed - ptdump was the only user, and other parts of the kernel evolved
>>> to use the kernel page table walking here.
>>>
>>> Since we have some complicated and confusing locking rules for the novma
>>> case, it makes sense to separate the two usages into their own functions.
>>>
>>> Doing this also provides self-documentation as to the intent of the caller -
>>> are they doing something rather unusual or are they simply doing a standard
>>> kernel page table walk?
>>>
>>> We therefore maintain walk_page_range_novma() for this single usage, and
>>> document the function as such.
>>
>> If we have to keep this dangerous interface, it should probably be
>>
>> walk_page_range_debug() or walk_page_range_dump()
>
> We can also move it from include/linux/pagewalk.h to mm/internal.h

Agreed.

--
Cheers,

David / dhildenb