Re: [PATCH] vfio/pci: make the vfio_pci_mmap_fault reentrant

From: Jason Gunthorpe
Date: Mon Mar 08 2021 - 18:44:45 EST


On Mon, Mar 08, 2021 at 01:21:06PM -0700, Alex Williamson wrote:
> > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> > index 65e7e6b..6928c37 100644
> > +++ b/drivers/vfio/pci/vfio_pci.c
> > @@ -1613,6 +1613,7 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
> > struct vm_area_struct *vma = vmf->vma;
> > struct vfio_pci_device *vdev = vma->vm_private_data;
> > vm_fault_t ret = VM_FAULT_NOPAGE;
> > + unsigned long pfn;
> >
> > mutex_lock(&vdev->vma_lock);
> > down_read(&vdev->memory_lock);
> > @@ -1623,18 +1624,23 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
> > goto up_out;
> > }
> >
> > - if (__vfio_pci_add_vma(vdev, vma)) {
> > - ret = VM_FAULT_OOM;
> > + if (!follow_pfn(vma, vma->vm_start, &pfn)) {
> > mutex_unlock(&vdev->vma_lock);
> > goto up_out;

Gah, no new follow_pfn() users please, we are trying to delete this
stuff.

I believe the right fix is to change the fault handler to use
vmf_insert_pfn_prot(), which has all the right locking/etc.
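
To illustrate why a per-page insert makes the fault handler safe to
re-enter, here is a toy userspace model, not kernel code. It assumes the
insert takes the page table lock and leaves an already-populated entry
alone. Every name in it (toy_vma, toy_handle_fault, ...) is hypothetical
and only stands in for the real structures:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
#define TOY_NPAGES   4
#define TOY_PTE_NONE 0UL

enum toy_fault_ret { TOY_FAULT_NOPAGE, TOY_FAULT_SIGBUS };

struct toy_vma {
	pthread_mutex_t ptl;            /* stands in for the page table lock */
	unsigned long base_pfn;         /* first PFN of the region, nonzero */
	unsigned long pte[TOY_NPAGES];  /* one slot per page, 0 means empty */
};

static void toy_vma_init(struct toy_vma *vma, unsigned long base_pfn)
{
	pthread_mutex_init(&vma->ptl, NULL);
	vma->base_pfn = base_pfn;
	for (size_t i = 0; i < TOY_NPAGES; i++)
		vma->pte[i] = TOY_PTE_NONE;
}

/* Handle a fault on page index pgoff by filling only that one slot. */
static enum toy_fault_ret toy_handle_fault(struct toy_vma *vma, size_t pgoff)
{
	if (pgoff >= TOY_NPAGES)
		return TOY_FAULT_SIGBUS;

	pthread_mutex_lock(&vma->ptl);
	/* A racing fault may have filled the slot already; then do nothing. */
	if (vma->pte[pgoff] == TOY_PTE_NONE)
		vma->pte[pgoff] = vma->base_pfn + pgoff;
	pthread_mutex_unlock(&vma->ptl);

	return TOY_FAULT_NOPAGE;
}
```

Faulting twice on the same page gives the same result as faulting once,
and no other page is touched, so the model needs no extra serialization
around the handler.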

> I'm surprised that it's left to the fault handler to provide this
> serialization, is this because we're filling the entire vma rather than
> only the faulting page?

I think it is because remap_pfn_range() is not intended to be called
from a fault handler. The fault handler APIs seem to be named vmf_*.

If you want to use the remap API, it has to be done and managed outside
the fault handler, i.e. when the MMIO transitions from valid->invalid,
vfio-pci wipes the address space; when it transitions from
invalid->valid, it calls remap_pfn_range(). vfio-pci provides its own
locking to protect these state transitions. A fault then simply always
triggers SIGBUS.
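
As a rough sketch of that alternative, here is a toy userspace model of
the state machine, not kernel code. The names (sim_dev, sim_mmio_enable,
sim_mmio_disable, sim_fault) are hypothetical, and the "mapped" flag only
stands in for whether the user mapping is currently populated:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
enum sim_fault_ret { SIM_FAULT_SIGBUS };

struct sim_dev {
	pthread_mutex_t state_lock; /* the driver's own transition lock */
	bool mmio_valid;            /* is MMIO access currently allowed? */
	bool mapped;                /* is the user mapping populated? */
};

static void sim_dev_init(struct sim_dev *dev)
{
	pthread_mutex_init(&dev->state_lock, NULL);
	dev->mmio_valid = false;
	dev->mapped = false;
}

/* invalid->valid: populate the whole mapping (the remap step). */
static void sim_mmio_enable(struct sim_dev *dev)
{
	pthread_mutex_lock(&dev->state_lock);
	if (!dev->mmio_valid) {
		dev->mapped = true;
		dev->mmio_valid = true;
	}
	pthread_mutex_unlock(&dev->state_lock);
}

/* valid->invalid: wipe the mapping (the zap step). */
static void sim_mmio_disable(struct sim_dev *dev)
{
	pthread_mutex_lock(&dev->state_lock);
	if (dev->mmio_valid) {
		dev->mapped = false;
		dev->mmio_valid = false;
	}
	pthread_mutex_unlock(&dev->state_lock);
}

/*
 * The fault handler never populates anything. A fault can only happen
 * while the mapping is wiped, so it always reports SIGBUS.
 */
static enum sim_fault_ret sim_fault(struct sim_dev *dev)
{
	(void)dev;
	return SIM_FAULT_SIGBUS;
}
```

In this model all populating and wiping happens under state_lock in the
two transition functions, so the fault path has nothing to serialize.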

I recall we discussed this design when you made the original patches,
but I don't completely recall why it ended up this way. However, I think
the reason might disappear after the address_space conversion in your
other series.

Jason