Re: [PATCH] kvm/vfio: Fix potential deadlock on vfio group_lock

From: Jason Gunthorpe
Date: Tue Jan 31 2023 - 10:15:24 EST


On Tue, Jan 31, 2023 at 10:00:14AM -0500, Matthew Rosato wrote:
> On 1/31/23 9:48 AM, Jason Gunthorpe wrote:
> > On Tue, Jan 31, 2023 at 09:46:18AM -0500, Anthony Krowiak wrote:
> >
> >>> Maybe you should split that lock and have a dedicated apcb lock?
> >>
> >> I don't think that would suffice for taking the vCPUs out of SIE.
> >
> > Then I think we have to keep this patch and also do Matthew's patch to
> > keep kvm refs inside vfio as well.
> >
>
> I don't think keeping kvm refs inside vfio solves this issue though
> -- Even if we handle the kvm_put_kvm asynchronously within vfio as
> previously proposed, kvm_vfio_release will eventually get called and
> it gets called with the kvm->lock already held, then proceeds to
> call vfio_file_set_kvm which gets the group->lock. That order
> conflicts with the hierarchy used by the driver during open_device
> of vfio->group_lock ... kvm->lock.

The group lock is held by vfio_file_set_kvm() only because we don't
have a refcount and we have to hold it across the open call to keep
the pointer alive.

With proper refcounting you'd split this to a spinlock and hold it
only while obtaining the get ref for the open thread.

Jason