RE: [PATCH v2 02/10] iommu: Introduce a new iommu_group_replace_domain() API

From: Tian, Kevin
Date: Tue Feb 21 2023 - 21:12:03 EST


> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Wednesday, February 15, 2023 8:53 PM
>
> On Wed, Feb 15, 2023 at 06:10:47AM +0000, Tian, Kevin wrote:
> > > From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > > Sent: Wednesday, February 8, 2023 5:18 AM
> > >
> > > +int iommu_group_replace_domain(struct iommu_group *group,
> > > + struct iommu_domain *new_domain)
> > > +{
> > > + int ret;
> > > +
> > > + if (!new_domain)
> > > + return -EINVAL;
> > > +
> > > + mutex_lock(&group->mutex);
> > > + ret = __iommu_group_set_domain(group, new_domain);
> > > + if (ret)
> > > + __iommu_group_set_domain(group, group->domain);
> >
> > Just realized that the error unwind is a nop, given the below:
> >
> > __iommu_group_set_domain()
> > {
> > if (group->domain == new_domain)
> > return 0;
> >
> > ...
> >
> > There was an attempt [1] to fix error unwind in iommu_attach_group(), by
> > temporarily setting group->domain to NULL before calling set_domain().
> >
> > Jason, I wonder why this recovering cannot be done in
> > __iommu_group_set_domain() directly, e.g.:
> >
> > ret = __iommu_group_for_each_dev(group, new_domain,
> > iommu_group_do_attach_device);
> > if (ret) {
> > __iommu_group_for_each_dev(group, group->domain,
> > iommu_group_do_attach_device);
> > return ret;
> > }
> > group->domain = new_domain;
>
> We talked about this already, sometimes this is not the correct
> recovery case, eg if we are going to a blocking domain we need to drop
> all references to the prior domain, not put them back.
>
> Failures are WARN_ON events not error recovery.
>

OK, I remember that. Then it looks like here we also need to temporarily
set group->domain to NULL before calling set_domain() to recover, so
that the unwind call is not skipped by the group->domain == new_domain
check, as [1] does.
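
To illustrate the point, here is a minimal user-space sketch (not kernel
code). All toy_* names are hypothetical, and the model assumes the only
thing that matters is the early return in set_domain() and a per-device
attach that may fail part-way through the group. It does not model the
case where the recovery attach itself fails, which per the above would
be a WARN_ON event.

```c
#include <stddef.h>

#define TOY_NDEV 3

/* Hypothetical stand-in for an iommu_domain. */
struct toy_domain {
	int fail_at;	/* index of the device whose attach fails, or -1 */
};

/* Hypothetical stand-in for an iommu_group. */
struct toy_group {
	struct toy_domain *domain;		/* domain recorded in the group */
	struct toy_domain *dev_dom[TOY_NDEV];	/* what each device is attached to */
};

/* Attach the devices one by one and stop at the first failure. */
static int toy_attach_all(struct toy_group *g, struct toy_domain *d)
{
	for (int i = 0; i < TOY_NDEV; i++) {
		if (d->fail_at == i)
			return -1;
		g->dev_dom[i] = d;
	}
	return 0;
}

/* Mirrors the early return quoted above. */
static int toy_set_domain(struct toy_group *g, struct toy_domain *d)
{
	int ret;

	if (g->domain == d)
		return 0;

	ret = toy_attach_all(g, d);
	if (ret)
		return ret;
	g->domain = d;
	return 0;
}

/*
 * Try to replace the domain and unwind on failure. Returns the number
 * of devices that are not attached to the old domain afterwards.
 */
static int toy_demo(int clear_first)
{
	struct toy_domain old_dom = { .fail_at = -1 };
	struct toy_domain new_dom = { .fail_at = 2 };	/* third attach fails */
	struct toy_group g = { .domain = &old_dom };
	int stale = 0;

	for (int i = 0; i < TOY_NDEV; i++)
		g.dev_dom[i] = &old_dom;

	if (toy_set_domain(&g, &new_dom)) {
		if (clear_first)
			g.domain = NULL;	/* defeat the early return */
		toy_set_domain(&g, &old_dom);
	}

	for (int i = 0; i < TOY_NDEV; i++)
		if (g.dev_dom[i] != &old_dom)
			stale++;
	return stale;
}
```

In this model toy_demo(0) leaves the first two devices attached to the
new domain, because the unwind call returns early, while toy_demo(1)
puts every device back on the old domain.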

[1] https://lore.kernel.org/linux-iommu/20230215052642.6016-1-vasant.hegde@xxxxxxx/