RE: [PATCH rc v7 6/6] iommu: Fix UAF in pci_dev_reset_iommu_done() due to concurrent detach
From: Tian, Kevin
Date: Tue Apr 21 2026 - 21:57:06 EST
> From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Sent: Wednesday, April 22, 2026 2:10 AM
>
> On Tue, Apr 21, 2026 at 07:41:03AM +0000, Tian, Kevin wrote:
> > > From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > > Sent: Sunday, April 19, 2026 7:42 AM
> > >
> > > @@ -2482,6 +2485,13 @@ static int
> > > __iommu_group_set_domain_internal(struct iommu_group *group,
> > > */
> > > result = 0;
> > > for_each_group_device(group, gdev) {
> > > + /*
> > > + * Device under recovery is attached to group-
> > > >blocking_domain.
> > > + * Don't change that. pci_dev_reset_iommu_done() will re-
> > > attach
> > > + * its domain to the updated group->domain, after the
> > > recovery.
> > > + */
> > > + if (gdev->blocked)
> > > + continue;
> >
> > This reminds me one thing. Ideally the blocked device really doesn't care
> > whether group->domain is the one before resetting or a different one
> > changed in middle. It's blocked then doesn't refer to any non-blocking
> > domains. After reset is done it's re-attached to whatever group->domain
> > is at that time.
> >
> > Then sounds there is no reason to block attach/replace too. Just skip
> > the blocked device and update group->domain then it will be picked up
> > later at reset done, just like done here for detach.
>
> There might be devices in the same group sharing RID?
I don't know what else can be done for it unless we introduce a new
structure to track those devices then apply attach/replace/detach/
reset_preapre/done()/... to all of them. otherwise it's not different from
the problem below.
>
> > Sashiko [1] gave another interesting comment about dma aliasing caused
> > by PCIe to PCI/PCI-X bridge - devices behind the bridge share a same
> > RID (then same device/context entry in IOMMU). In this case unblocking
> > devA could prematurely unblock devB which is actively undergoing a reset.
>
> Exactly. I recall we talked about it when introducing this entire
> reset piece: there was a piece of condition in the reset helpers
> skipping aliasing groups, then we dropped it to focus on singleton
> groups for the first version. Maybe we can resume the discussion,
> but I doubt we could exclude those RID sharing cases...
>
> Nicolin