Re: [PATCH v2 3/4] iommufd: Destroy vdevice on idevice destroy

From: Xu Yilun
Date: Wed Jun 25 2025 - 06:14:34 EST


On Tue, Jun 24, 2025 at 11:53:46AM -0300, Jason Gunthorpe wrote:
> On Mon, Jun 23, 2025 at 05:49:45PM +0800, Xu Yilun wrote:
> > +static void iommufd_device_remove_vdev(struct iommufd_device *idev)
> > +{
> > +	bool vdev_removing = false;
> > +
> > +	mutex_lock(&idev->igroup->lock);
> > +	if (idev->vdev) {
> > +		struct iommufd_vdevice *vdev;
> > +
> > +		vdev = iommufd_get_vdevice(idev->ictx, idev->vdev->obj.id);
> > +		if (IS_ERR(vdev)) {
>
> This increments obj.users, which will cause a concurrent
> iommufd_object_remove() to fail with -EBUSY, which we are trying to
> avoid.

I have the same question as Kevin, so let's leave this to that thread.

[...]

> 	/*
> 	 * We don't know what thread is actually going to destroy the vdev, but
> 	 * once the vdev is destroyed the pointer is NULL'd. At this
> 	 * point idev->users is 0 so no other thread can set a new vdev.
> 	 */
> 	if (!wait_event_timeout(idev->ictx->destroy_wait,
> 				!READ_ONCE(idev->vdev),
> 				msecs_to_jiffies(60000)))
> 		pr_crit("Time out waiting for iommufd vdevice removed\n");
> }
>
> Though there is a cleaner option here, you could do:
>
> 	mutex_lock(&idev->igroup->lock);
> 	if (idev->vdev)
> 		iommufd_vdevice_abort(&idev->vdev->obj);
> 	mutex_unlock(&idev->igroup->lock);
>
> And make it safe to call abort twice, eg by setting dev to NULL and
> checking for that. First thread to get to the igroup lock, either via
> iommufd_vdevice_destroy() or via the above will do the actual abort
> synchronously without any wait_event_timeout. That seems better??

I'm fine with both options, but I would rather not make vdevice so
different from other objects, so I still prefer the wait_event option.
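
For comparison, here is a rough userspace model of the "safe to call
abort twice" alternative. It is only a sketch: a pthread mutex stands in
for igroup->lock, and all model_* names are made up for illustration
rather than taken from the iommufd code. Whichever path takes the lock
first does the teardown, and the second call sees the NULL back-pointer
and returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for iommufd_device / iommufd_vdevice. */
struct model_vdev;

struct model_idev {
	pthread_mutex_t lock;		/* plays the role of igroup->lock */
	struct model_vdev *vdev;	/* NULL once the vdev is torn down */
	int abort_count;		/* counts how often teardown really ran */
};

struct model_vdev {
	struct model_idev *idev;	/* NULL once aborted */
};

/* Caller must hold idev->lock. Safe to call more than once. */
static void model_vdev_abort(struct model_idev *idev, struct model_vdev *vdev)
{
	assert(vdev != NULL);

	if (!vdev->idev)
		return;			/* another path already aborted it */

	vdev->idev = NULL;
	idev->vdev = NULL;
	idev->abort_count++;
}

/* Path 1: idev teardown removes a vdev that is still attached. */
static void model_idev_remove_vdev(struct model_idev *idev)
{
	pthread_mutex_lock(&idev->lock);
	if (idev->vdev)
		model_vdev_abort(idev, idev->vdev);
	pthread_mutex_unlock(&idev->lock);
}

/* Path 2: explicit vdev destroy. */
static void model_vdev_destroy(struct model_idev *idev, struct model_vdev *vdev)
{
	pthread_mutex_lock(&idev->lock);
	model_vdev_abort(idev, vdev);
	pthread_mutex_unlock(&idev->lock);
}
```

In this model the teardown runs synchronously under the lock, so there
is no wait and no timeout, but the vdev needs its own "already aborted"
check that other objects do not have.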

>
> > +	/* vdev can't outlive idev, vdev->idev is always valid, need no refcnt */
> > +	vdev->idev = idev;
>
> So this means as soon as 'idev->vdev = NULL;' happens idev is an
> invalid pointer. Need a WRITE_ONCE there.
>
> I would rephrase the comment as
> iommufd_device_destroy() waits until idev->vdev is NULL before
> freeing the idev, which only happens once the vdev is finished
> destruction. Thus we do not need refcounting on either idev->vdev or
> vdev->idev.
>
> and group both assignments together.

Sounds good to me.
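
Roughly, the pairing could be modeled in userspace as below. This is a
sketch under assumptions, not the patch itself: C11 atomics stand in for
WRITE_ONCE()/READ_ONCE(), and the model_* names are hypothetical. Both
pointers are set in one place, and the idev may only be freed once its
vdev pointer reads as NULL.

```c
#include <stdatomic.h>
#include <stddef.h>

struct model_idev;

struct model_vdev {
	struct model_idev *idev;		/* valid while linked, no refcount */
};

struct model_idev {
	_Atomic(struct model_vdev *) vdev;	/* NULL means no vdev attached */
};

/* Set both directions together so the lifetime rule is visible in one spot. */
static void model_link(struct model_idev *idev, struct model_vdev *vdev)
{
	vdev->idev = idev;
	atomic_store_explicit(&idev->vdev, vdev, memory_order_release);
}

/*
 * Last step of vdev destruction. After the store below the idev may be
 * freed by its owner, so idev must not be touched again here.
 */
static void model_unlink(struct model_vdev *vdev)
{
	struct model_idev *idev = vdev->idev;

	vdev->idev = NULL;
	atomic_store_explicit(&idev->vdev, NULL, memory_order_release);
}

/* The idev destroy path checks (or waits on) this before freeing the idev. */
static int model_idev_can_free(struct model_idev *idev)
{
	return atomic_load_explicit(&idev->vdev, memory_order_acquire) == NULL;
}
```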

Thanks,
Yilun