Re: [PATCH v2 1/4] mm: Introduce vm_uffd_ops API

From: Liam R. Howlett
Date: Wed Jul 02 2025 - 11:44:26 EST


* Suren Baghdasaryan <surenb@xxxxxxxxxx> [250701 13:04]:
...

> > > + /**
> > > + * uffd_copy: Handler to resolve UFFDIO_COPY|ZEROPAGE request.
> > > + *
> > > + * @dst_pmd: target pmd to resolve page fault
> > > + * @dst_vma: target vma
> > > + * @dst_addr: target virtual address
> > > + * @src_addr: source address to copy from
> > > + * @flags: userfaultfd request flags
> > > + * @foliop: previously allocated folio
> > > + *
> > > + * Return: zero if succeeded, negative for errors.
> >
> > Can you please ensure you put details as to VMA lock state here. Uffd has
> > some very tricky handling around stuff like this.
> >
> > > + */
> > > + int (*uffd_copy)(pmd_t *dst_pmd, struct vm_area_struct *dst_vma,
> > > + unsigned long dst_addr, unsigned long src_addr,
> > > + uffd_flags_t flags, struct folio **foliop);
> >
> > Do we not need a uffd_ctx parameter here?
> >
> > It seems like we're assuming a _lot_ of mm understanding in the underlying
> > driver here.
> >
> > I'm not sure it's really normal to be handing around page table state and
> > folios etc. to a driver like this, this is really... worrying to me.
> >
> > This feels like you're trying to put mm functionality outside of mm?
>
> To second that, two things stick out for me here:
> 1. uffd_copy and uffd_get_folio seem to be at different abstraction
> levels. uffd_copy is almost the entire copy operation for VM_SHARED
> VMAs while uffd_get_folio is a small part of the continue operation.
> 2. shmem_mfill_atomic_pte which becomes uffd_copy for shmem in the
> last patch is quite a complex function which itself calls some IMO
> pretty internal functions like mfill_atomic_install_pte(). Expecting
> modules to implement such functionality seems like a stretch to me

Yes.

I don't think this is a good idea to expose, since there is no way to
restrict it to a specific implementations.

We used to pass out a vma to a driver for updating the vma flags, and
even that proved to be too permissive and it ended very poorly. We had
drivers saving the pointer then dropping the lock, we had drivers
changing things under us. Then there was the fallout in the mm for
trying to deal with what may have happened - and the failure scenarios
of the dealing with it didn't work out.

What this is doing is leading down a path to allow such things at an
even lower level. I think this is too flexible and opens us up to
unintentional abuse.

That is to say, I don't think we should expose this outside of mm for
the benefit of everyone.

> but
> maybe this is for some specialized modules which are written by mm
> experts only?
>

Even with 'experts' (do any of us really know what we're doing,
anyways? ;) we may get into a situation where a company may write
themselves into a corner, depending on specific functionality that's
hard to re-implement and a nightmare to keep working. I don't want to
bind.. er.. to a specific example, but I believe we can all think of at
least one.

Is there more information you can share as to why you want to expose
this functionality? Maybe we can find another way that does not expose
the internals and accomplishes what you want?

Thanks,
Liam