Re: [PATCH 5/5] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings

From: Jason Gunthorpe
Date: Mon Jun 30 2025 - 10:05:56 EST


On Wed, Jun 25, 2025 at 03:26:44PM -0400, Peter Xu wrote:
> On Wed, Jun 25, 2025 at 03:41:54PM -0300, Jason Gunthorpe wrote:
> > On Wed, Jun 25, 2025 at 01:12:11PM -0400, Peter Xu wrote:
> >
> > > After I read the two use cases, I mostly agree. Just one trivial thing to
> > > mention, it may not be direct map but vmap() (see io_region_init_ptr()).
> >
> > If it is vmapped then this is all silly, you should vmap and mmap
> > using the same cache colouring and, AFAIK, pgoff is how this works for
> > purely userspace.
> >
> > Once vmap'd it should determine the cache colour and set the pgoff
> > properly, then everything should already work, no?
>
> I don't yet see how to set the pgoff. Here pgoff is passed from the
> userspace, which follows io_uring's definition (per io_uring_mmap).

That's too bad.

So you have to do it the other way and pass the pgoff to the vmap, so
the vmap ends up with the same colouring as a user VMA holding the
same pages.
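
Roughly, as an untested userspace sketch of the arithmetic only. The
sk_* names, the 4 KiB page size and the power-of-two colour count are
all assumptions made up for illustration; this is not the arch
COLOUR_ALIGN code nor an existing vmap interface:

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Colour of a virtual address: its page index modulo the colour count. */
static inline unsigned long sk_va_colour(uintptr_t va, unsigned long n_colours)
{
	assert(n_colours && (n_colours & (n_colours - 1)) == 0);
	return (va >> SK_PAGE_SHIFT) & (n_colours - 1);
}

/* Colour a mapping gets from its page offset. */
static inline unsigned long sk_pgoff_colour(unsigned long pgoff,
					    unsigned long n_colours)
{
	assert(n_colours && (n_colours & (n_colours - 1)) == 0);
	return pgoff & (n_colours - 1);
}

/*
 * Lowest page-aligned address >= base whose colour matches pgoff.
 * A colour-aware vmap would have to place its area like this so the
 * kernel alias and the user VMA land on the same colour.
 */
static inline uintptr_t sk_colour_align(uintptr_t base, unsigned long pgoff,
					unsigned long n_colours)
{
	uintptr_t page_mask = ((uintptr_t)1 << SK_PAGE_SHIFT) - 1;
	uintptr_t span = (uintptr_t)n_colours << SK_PAGE_SHIFT;
	uintptr_t start = (base + page_mask) & ~page_mask;
	uintptr_t want = (uintptr_t)sk_pgoff_colour(pgoff, n_colours)
			 << SK_PAGE_SHIFT;
	uintptr_t cur = start & (span - 1);

	/* Step forward by the colour distance, wrapping within one span. */
	return start + ((want - cur) & (span - 1));
}
```

With 4 colours, base 0x12000 and pgoff 1, this picks 0x15000: the
first page at or above the base whose colour is 1.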

> So if we want the new API to be proposed here, and make VFIO use it first
> (while considering it applicable to at least all existing MMU users, all
> of which I have checked so far), I'd think this is proper:
>
> int (*mmap_va_hint)(struct file *file, unsigned long *pgoff, size_t len);
>
> The changes compared to the previous version:
>
> (1) merged pgoff and *phys_pgoff parameters into one unsigned long, so
> the hook can adjust the pgoff for the va allocator to be used. The
> adjustment will not be visible to future mmap() when VMA is created.

It seems functional, but the above is better, IMHO.

> (2) I renamed it to mmap_va_hint(), because *pgoff will be able to be
> updated, so it's not only about ordering, but "order" and "pgoff
> adjustment" hints that the core mm will use when calculating the VA.

Where does the order come back, though? Is it the return value?

It seems viable.
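
To make sure I read it right, here is an untested userspace sketch of
what I understand the semantics to be, assuming the int return is the
order. Everything named sk_* is hypothetical, including the region
struct (the proposed hook takes a struct file), the 4 KiB page size and
the single PMD-sized order:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12 /* assumption: 4 KiB base pages */
#define SK_PMD_ORDER 9   /* assumption: 2 MiB huge mappings */

/* Hypothetical driver state standing in for what hangs off struct file. */
struct sk_region {
	unsigned long file_pgoff; /* pgoff userspace passes to mmap() */
	unsigned long phys_pfn;   /* first physical page frame behind it */
};

/*
 * Hypothetical hook body: rewrite *pgoff so it tracks the physical
 * alignment, and return the largest order worth aligning the VA for.
 */
static inline int sk_mmap_va_hint(const struct sk_region *r,
				  unsigned long *pgoff, size_t len)
{
	*pgoff = r->phys_pfn + (*pgoff - r->file_pgoff);
	if (len >= ((size_t)1 << (SK_PMD_ORDER + SK_PAGE_SHIFT)))
		return SK_PMD_ORDER;
	return 0;
}

/*
 * Hypothetical core mm side: choose a VA at or above base that is
 * congruent to pgoff modulo 2^order pages. Not necessarily the lowest.
 */
static inline uintptr_t sk_pick_va(uintptr_t base, unsigned long pgoff,
				   int order)
{
	uintptr_t span = (uintptr_t)1 << (order + SK_PAGE_SHIFT);
	uintptr_t want = ((uintptr_t)pgoff << SK_PAGE_SHIFT) & (span - 1);
	uintptr_t aligned = (base + span - 1) & ~(span - 1);

	return aligned + want;
}
```

The adjusted pgoff is only consumed by sk_pick_va() here, which matches
the point that the adjustment is not visible once the VMA is created.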

Jason