Re: [PATCH 1/4] iommu: Add a broken_unmanaged_domain flag in iommu_ops

From: Robin Murphy
Date: Fri Jan 27 2023 - 16:59:33 EST


On 2023-01-27 20:04, Nicolin Chen wrote:
Both IOMMU_DOMAIN_UNMANAGED and IOMMU_DOMAIN_DMA require the support
of __IOMMU_DOMAIN_PAGING capability, i.e. iommu_map/unmap. However,
some older iommu drivers do not fully support that, and these drivers
also do not advertise support for dma-iommu.c via IOMMU_DOMAIN_DMA,
or use arm_iommu_create_mapping(), so largely their implementations
of IOMMU_DOMAIN_UNMANAGED are untested. This means that a user like
vfio/iommufd does not likely work with them.

Several of them have obvious problems:
* fsl_pamu_domain.c
Without map/unmap ops in the default_domain_ops, it isn't an
unmanaged domain at all.
* mtk_iommu_v1.c
With a fixed 4M "pagetable", it can only map exactly 4G of
memory, but doesn't set the aperture.

The aperture is easily fixed (one could argue that what's broken there are the ARM DMA ops for assuming every IOMMU has a 32-bit IOVA space and not checking).

* tegra-gart.c
Its notion of attach/detach and groups has to be a complete lie to
get around all the other API expectations.

That's true, and the domain is tiny and not isolated from the rest of the address space outside the aperture, but the one thing it does do is support iommu_map/unmap just fine, which is what this flag is documented as saying it doesn't.

Some others might work but have never been tested with vfio/iommufd:
* msm_iommu.c
* omap-iommu.c
* tegra-smmu.c

And yet they all have other in-tree users (GPUs on MSM and Tegra, remoteproc on OMAP) that allocate unmanaged domains and use iommu_map/unmap just fine, so they're clearly not broken either.

On the flipside, you're also missing cases like apple-dart, which can have broken unmanaged domains by any definition, but only under certain conditions (at least it "fails safe" and they will refuse attempts to attach anything). I'd also question sprd-iommu, which hardly has a generally-useful domain size, and has only just recently gained the ability to unmap anything successfully. TBH none of the SoC IOMMUs are likely to ever be of interest to VFIO or IOMMUFD, since the only things they could assign to userspace are the individual devices - usually graphics and media engines - that they're coupled to, whose useful functionality tends to depend on clocks, phys, and random other low-level stuff that would be somewhere between impractical and downright unsafe to attempt to somehow expose as well.

Thus, mark all these drivers as having "broken" UNAMANGED domains and
add a new device_iommu_unmanaged_supported() API for vfio/iommufd and
dma-iommu to refuse to work with these drivers.

Co-developed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Signed-off-by: Nicolin Chen <nicolinc@xxxxxxxxxx>

[...]

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 46e1347bfa22..919a5dbad75b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -245,6 +245,10 @@ struct iommu_iotlb_gather {
* pasid, so that any DMA transactions with this pasid
* will be blocked by the hardware.
* @pgsize_bitmap: bitmap of all possible supported page sizes
+ * @broken_unmanaged_domain: IOMMU_DOMAIN_UNMANAGED is not fully functional; the
+ * driver does not really support iommu_map/unmap, but
+ * uses UNMANAGED domains for the IOMMU API, called by
+ * other SOC drivers.

"uses UNMANAGED domains for the IOMMU API" is literally the definition of unmanaged domains :/

Some "other SOC drivers" use more of the IOMMU API than VFIO does :/

Please just add IOMMU_CAP_IOMMUFD to represent whatever the nebulous requirements of IOMMUFD actually are (frankly it's no less informative than calling domains "broken"), handle that in the drivers you care about and have tested, and use device_iommu_capable(). What you're describing in this series is a capability, and we have a perfectly good API for drivers to express those already. Plus, as demonstrated above, a positive capability based on empirical testing will be infinitely more robust than a negative one based on guessing.

Thanks,
Robin.