Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA

From: Jason Gunthorpe
Date: Wed Apr 06 2022 - 12:41:20 EST


On Wed, Apr 06, 2022 at 02:37:40PM +0100, Robin Murphy wrote:
> On 2022-04-06 14:06, Jason Gunthorpe wrote:
> > On Wed, Apr 06, 2022 at 01:32:07PM +0100, Robin Murphy wrote:
> > > a particular IOMMU instance, and potentially allocate separate domains for
> > > separate devices to represent the same address space, much like
> > > vfio_iommu_type1_attach_group() does.
> >
> > I think this VFIO code also needs some more work, it currently assumes
> > that if the domain ops are the same then the domains are compatible,
> > and that is not true for ARM SMMU drivers.
>
> Well, strictly it uses the ops as a "negative" heuristic that the domains
> are not definitely incompatible, and only then consolidates domains if
> cross-attaching actually succeeds. So I don't think it's really so bad.

Oh that is sneaky, I didn't appreciate that bit of logic..

> > I've been thinking of adding a domain callback 'can device attach' and
> > replacing the ops compare with that instead.
>
> Previous comment notwithstanding, much as I do tend to prefer "just try the
> operation and see what happens" APIs, that might be more reliable in the
> long run than trying to encode specific "sorry, you'll need to allocate a
> separate domain for this device" vs. "this should have worked but something
> went wrong" semantics in the return value from attach.

Overall the way vfio is doing this has alot of overhead. Not only does
it create a domain it may not use it also does this:

iommu_detach_group(domain->domain, group->iommu_group);
if (!iommu_attach_group(d->domain,
group->iommu_group)) {
list_add(&group->next, &d->group_list);
iommu_domain_free(domain->domain);
kfree(domain);
goto done;
}

ret = iommu_attach_group(domain->domain,
group->iommu_group);

So, if we already have a compatible domain VFIO does an extra domain
alloc/dealloc and 3 extra attach/detatches per domain it tests.

It is not very elegant at least..

> > > It's not really worth IOMMU drivers trying to support a domain spanning
> > > potentially-heterogeneous instances internally, since they can't reasonably
> > > know what matters in any particular situation.
> >
> > In the long run I think it will be worth optimizing. If the SMMU
> > instances can share IOPTE memory then we get two wins - memory
> > reduction and reduced work to read dirty bits.
> >
> > The dirty read in particular is very performance sensitive so if real
> > work loads have many SMMUs per VM it will become a pain point.
>
> In the ideal case though, the SVA domains are only there to logically bridge
> between an existing process pagetable and IOMMU instances at the API level,
> right? Surely if we're shadowing physical pagetables for SVA we've basically
> already lost the performance game?

Sorry, I was not talking about SVA with that remark, speaking
generally about normal iommu_domains and sharing between SMMU
instances.

For SVA I see no issue with duplicating it per instance since it is
very small/etc - the only drawback is that the common code has to do
the 'can device attach' dance and keep a domain list per
mm-struct. Which seems OK to me.

Jason