Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

From: Jason Gunthorpe
Date: Tue Jul 27 2021 - 15:03:24 EST


On Tue, Jul 27, 2021 at 12:53:09PM -0600, Alex Williamson wrote:
> On Tue, 27 Jul 2021 14:32:09 -0300
> Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> > On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:
> > > On Mon, Jul 26 2021, Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
> > >
> > > > On Mon, 26 Jul 2021 20:09:06 -0300
> > > > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> > > >
> > > >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> > > >>
> > > >> > But I wonder why nobody else implements this? Lack of surprise removal?
> > > >>
> > > >> The only implementation triggers an eventfd that seems to be the same
> > > >> eventfd as the interrupt..
> > > >>
> > > >> Do you know how this works in userspace? I'm surprised that the
> > > >> interrupt eventfd can trigger an observation that the kernel driver
> > > >> wants to be unplugged?
> > > >
> > > > I think we're talking about ccw, but I see QEMU registering separate
> > > > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > > > triggering the req_trigger...? Thanks,
> > > >
> > > > Alex
> > >
> > > Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> > > checks), and this one.
> >
> > If it is a dedicated eventfd for 'device being removed' why is it in
> > the CCW implementation and not core code?
>
> The CCW implementation (likewise the vfio-pci implementation) owns
> the IRQ index address space and the decision to make this a signal
> to userspace rather than perhaps some handling a device might be
> able to do internally.

The core code holds the vfio_device_get() reference for as long as the
FD is open. There is no way to get past the wait_for_completion without
userspace closing the FD, so the drivers don't really have much choice
beyond signaling userspace to close the FD?
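
To make the point concrete, here is a rough userspace model of that
unregister path. It is only a sketch: every name in it (model_device,
model_put, model_request, model_unregister, model_run) is made up for
illustration and none of it is the kernel's code. The max_tries bound
is also an assumption added so the model terminates; the real path
keeps waiting until the last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not the kernel's struct vfio_device. */
struct model_device {
	int refcount;               /* one ref for registration, one for the open FD */
	bool fd_open;               /* userspace still has the device FD open */
	unsigned int close_after;   /* userspace closes the FD after this many requests */
	unsigned int requests_seen; /* how many times the request callback has fired */
};

static void model_put(struct model_device *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}

/*
 * Stands in for a driver's ->request callback. All it can usefully do
 * is signal userspace; here "userspace" reacts by closing the FD once
 * it has been asked close_after times.
 */
static void model_request(struct model_device *dev, unsigned int count)
{
	dev->requests_seen = count + 1;
	if (dev->fd_open && dev->requests_seen >= dev->close_after) {
		dev->fd_open = false;
		model_put(dev);     /* closing the FD drops its reference */
	}
}

/*
 * Models the unregister path: drop the registration reference, then
 * keep calling the request callback with an increasing counter until
 * the last reference is gone. Returns the number of requests sent.
 */
static unsigned int model_unregister(struct model_device *dev,
				     unsigned int max_tries)
{
	unsigned int i = 0;

	model_put(dev);
	while (dev->refcount > 0 && i < max_tries)
		model_request(dev, i++);
	return i;
}

/* Convenience wrapper: one registered device with one open FD. */
unsigned int model_run(unsigned int close_after, unsigned int max_tries)
{
	struct model_device dev = {
		.refcount = 2,
		.fd_open = true,
		.close_after = close_after,
	};

	return model_unregister(&dev, max_tries);
}
```

In this model, model_run(3, 10) sends three requests before the FD is
closed and the loop ends, while model_run(100, 5) uses up all five
tries because userspace never closes the FD. Nothing the callback does
other than getting the FD closed lets the loop finish.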

> For instance an alternate vfio-pci implementation might zap all
> mmaps, block all r/w access, and turn this into a surprise removal.

This is nice, but it wouldn't close the FD, so it needs core changes
anyhow..

> Another implementation might be more aggressive and send SIGKILL
> to the user process.

We don't try to revoke FDs from the kernel; it is racy, dangerous and
unreliable.

> This was the thought behind why vfio-core triggers the driver
> request callback with a counter, leaving the policy to the driver.

IMHO subsystem policy does not belong in drivers. Down that road lies
a mess for userspace.

Jason