Re: [PATCH 1/4] mm: Introduce vm_uffd_ops API

From: David Hildenbrand
Date: Mon Jun 23 2025 - 12:55:00 EST


On 23.06.25 15:59, Peter Xu wrote:
On Mon, Jun 23, 2025 at 10:25:33AM +0200, David Hildenbrand wrote:
On 20.06.25 21:03, Peter Xu wrote:

Hi Peter,

Hey David,


Introduce a generic userfaultfd API for vm_operations_struct, so that one
vma, especially when as a module, can support userfaults without modifying

The sentence is confusing ("vma ... as a module").

Did you mean something like ".. so that a vma that is backed by a
special-purpose in-memory filesystem like shmem or hugetlb can support
userfaultfd without modifying the uffd core; this is required when the
in-memory filesystem is built as a module."

I wanted to avoid mentioning of "in-memory file systems" here.

I thought one of the challenges of supporting guest_memfd on anything that is not a special in-memory file system is also related to how the pagecache handles readahead.

So ...


How about an updated commit like this?

Currently, most of the userfaultfd features are implemented directly in the
core mm. It will invoke VMA specific functions whenever necessary. So far
it is fine because it almost only interacts with shmem and hugetlbfs.

This patch introduces a generic userfaultfd API for vm_operations_struct,
so that any type of file (including kernel modules that can be compiled
separately from the kernel core) can support userfaults without modifying
the core files.

... is it really "any file"? I doubt it, but you likely have a better idea of how it all could just work with "any file".


After this API applied, if a module wants to support userfaultfd, the
module should only need to touch its own file and properly define
vm_uffd_ops, instead of changing anything in core mm.

...

Talking about files and modules is still confusing, I'm afraid. It's really a special-purpose file (not just any ordinary file on an ordinary filesystem), no?



the core files. More importantly, when the module can be compiled out of
the kernel.

So, instead of having core mm referencing modules that may not ever exist,
we need to have modules opt-in on core mm hooks instead.

After this API applied, if a module wants to support userfaultfd, the
module should only need to touch its own file and properly define
vm_uffd_ops, instead of changing anything in core mm.
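
For readers following along, the opt-in direction described above can be sketched in plain userspace C. This is only a mock under stated assumptions, not the code from the series: every mock_* name, the userfaultfd_ops field, the result codes and the uffd_copy() signature are hypothetical (the thread only names vm_uffd_ops, vm_operations_struct and uffd_copy()), and a NULL vm_ops is used here as a stand-in for anonymous memory that core mm keeps handling itself.

```c
/* Userspace mock of the opt-in hook pattern; NOT the kernel definitions. */
#include <assert.h>
#include <stddef.h>

struct mock_vma;

/* Hypothetical per-type hooks (assumed layout and signature). */
struct mock_uffd_ops {
	int (*uffd_copy)(struct mock_vma *vma, unsigned long dst_addr);
};

/* Stand-in for vm_operations_struct with an optional uffd hook table. */
struct mock_vm_ops {
	const struct mock_uffd_ops *userfaultfd_ops; /* NULL: type did not opt in */
};

struct mock_vma {
	const struct mock_vm_ops *vm_ops; /* NULL: treated as anonymous in this mock */
	unsigned long last_dst;           /* records what the type's hook saw */
};

enum mock_result {
	MOCK_HANDLED_BY_TYPE = 1, /* dispatched through the type's own hook */
	MOCK_HANDLED_BY_CORE = 2, /* anonymous memory, processed by core */
	MOCK_UNSUPPORTED     = 3, /* file-backed type without uffd hooks */
};

/* "Core" side: knows no concrete type, only follows the pointers it is given. */
static int mock_core_uffd_copy(struct mock_vma *vma, unsigned long dst_addr)
{
	assert(vma != NULL);

	if (!vma->vm_ops)
		return MOCK_HANDLED_BY_CORE;
	if (!vma->vm_ops->userfaultfd_ops ||
	    !vma->vm_ops->userfaultfd_ops->uffd_copy)
		return MOCK_UNSUPPORTED;

	return vma->vm_ops->userfaultfd_ops->uffd_copy(vma, dst_addr);
}

/* "Module" side: everything lives in the module's own file. */
static int mock_module_uffd_copy(struct mock_vma *vma, unsigned long dst_addr)
{
	vma->last_dst = dst_addr;
	return MOCK_HANDLED_BY_TYPE;
}

static const struct mock_uffd_ops mock_module_uffd_ops = {
	.uffd_copy = mock_module_uffd_copy,
};

/* A type that opted in, and one that did not. */
static const struct mock_vm_ops mock_module_vm_ops = {
	.userfaultfd_ops = &mock_module_uffd_ops,
};

static const struct mock_vm_ops mock_plain_vm_ops = {
	.userfaultfd_ops = NULL,
};
```

The point of the sketch is the direction of the dependency: the core function never names a concrete type, so a type that may not be built at all costs core nothing, and adding a new one means defining one more ops table next to its own code.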

Talking about modules that much is a bit confusing. I think this is more
about cleanly supporting in-memory filesystems, without the need to
special-case each and every one of them; can be viewed as a cleanup independent
of the module requirement from guest_memfd.

Yes. But if we don't need to support kernel modules actually we don't need
this.. IMHO it's so far really about cleanly support kernel modules, which
can even be out-of-tree (though that's not my purpose of the change..).

Please help check if above updated commit message would be better.

I agree that another special-purpose file (like the one implemented by guest_memfd) would need that. But if we could get rid of "hugetlb"/"shmem" special-casing in userfaultfd, it would be a reasonable independent cleanup.

But I can spot in patch #3 now:

"Hugetlbfs still has its own hard-coded handler in userfaultfd, due to limitations similar to vm_operations_struct.fault(). TODO: generalize it to use the API function."

I would have hoped that we clean that up in one go instead.




Note that such API will not work for anonymous. Core mm will process
anonymous memory separately for userfault operations like before.

This patch only introduces the API alone so that we can start to move
existing users over but without breaking them.

Currently the uffd_copy() API is almost designed to be the simplistic with
minimum mm changes to move over to the API.


Is there a way to move part of the actual implementation (how this is all
wired up) from patch #4 into this patch, to then only remove the old
shmem/hugetlb hooks (that are effectively unused) in patch #4?

Not much I really removed on the hooks, but I was trying to reuse almost
existing functions. Here hugetlb is almost untouched on hooks, then I
reused the shmem existing function for uffd_copy() rather than removing it
(I did need to remove the definition in the shmem header though because it's
not needed to be exported).

The major thing got removed in patch 4 was some random checks over uffd ops
and vma flags. I intentionally made them all in patch 4 to make review
possible. Otherwise it can be slightly awkward to reason what got removed
without knowing what is protecting those checks.

Agreed. It's a shame the new API is not a proper replacement for hugetlb special casing just yet ...

--
Cheers,

David / dhildenb