Re: [PATCH v2 08/10] iommufd/device: Use iommu_group_replace_domain()

From: Nicolin Chen
Date: Tue Feb 14 2023 - 20:58:56 EST


On Wed, Feb 15, 2023 at 01:37:19AM +0000, Tian, Kevin wrote:

> > From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > Sent: Tuesday, February 14, 2023 6:54 PM
> >
> > On Mon, Feb 13, 2023 at 10:49:34AM -0400, Jason Gunthorpe wrote:
> > > On Sun, Feb 12, 2023 at 11:48:30PM -0800, Nicolin Chen wrote:
> > >
> > > > What about point 1? If dev2 and dev3 are already replaced when
> > > > doing iommu_group_replace_domain() on dev1, their idev objects
> > > > still have the old hwpt/iopt until user space does another two
> > > > IOCTLs on them, right?
> > >
> > > We have a complicated model for multi-device groups...
> > >
> > > The first device in the group to change domains must move all the
> > > devices in the group
> > >
> > > But userspace is still expected to run through and change all the
> > > other devices
> > >
> > > So replace should be a NOP if the group is already linked to the right
> > > domain.
> > >
> > > This patch needs to make sure that incosistency in the view betwen the
> > > iommu_group and the iommufd model doesn't cause a functional
> > > problem.
> >
> > Yea, I was thinking that we'd need to block any access to the
> > idev->hwpt of a pending device's, before the kernel finishes
> > the "NOP" IOCTL from userspace, maybe with a helper:
> > (iommu_get_domain_for_dev(idev->dev) != idev->hwpt->domain)
> >
>
> I didn't see what would be broken w/o such blocking measure.
>
> Can you elaborate?

iommu_group { idev1, idev2 }

(1) Attach all devices to domain1 {
group->domain = domain1;
idev1->hwpt = hwpt1; // domain1
idev2->hwpt = hwpt1; // domain1
}

(2) Attach (replace) idev1 only to domain2 {
group->domain = domain2
idev1->hwpt = hwpt2; // domain2
idev2->hwpt == domain1 // != iommu_get_domain_for_dev()
}

Then if user space isn't aware of these and continues to do
IOMMU_IOAS_MAP for idev2. Though IOVA mappings may be added
onto the domain2 correctly, yet domain2's iopt itree won't
reflect that, until idev2->hwpt is updated too, right?

And the tricky thing is that, though we advocate a device-
centric uAPI, we'd still have to ask user space to align the
devices in the same iommu_group, which is also not present
in QEMU's RFC v3 series.

The traditional detach+attach flow doesn't seem to have this
issue, since there's no re-entry so the work flow is always
that detaching all devices first before attaching to another
domain.

Thanks
Nic