[PATCH 5.19 190/192] iommu: Fix false ownership failure on AMD systems with PASID activated

From: Greg Kroah-Hartman
Date: Tue Sep 13 2022 - 10:30:34 EST


From: Jason Gunthorpe <jgg@xxxxxxxxxx>

commit 2380f1e8195ef612deea1dc7a3d611c5d2b9b56a upstream.

The AMD IOMMU driver cannot activate PASID mode on a RID without the RID's
translation being set to IDENTITY. Further it requires changing the RID's
page table layout from the normal v1 IOMMU_DOMAIN_IDENTITY layout to a
different v2 layout.

It does this by creating a new iommu_domain, configuring that domain for
v2 identity operation and then attaching it to the group, from within the
driver. This logic assumes the group is already set to the IDENTITY domain
and is being used by the DMA API.

However, since the ownership logic is based on the group's domain pointer
equaling the default domain to detect DMA API ownership, this causes it to
look like the group is not attached to the DMA API any more. This blocks
attaching drivers to any other devices in the group.

In a real system this manifests itself as the HD-audio devices on some AMD
platforms losing their device drivers.

Work around this unique behavior of the AMD driver by checking for
equality of IDENTITY domains based on their type, not their pointer
value. This allows the AMD driver to have two IDENTITY domains for
internal purposes without breaking the check.

Have the AMD driver properly declare that the special domain it created is
actually an IDENTITY domain.

Cc: Robin Murphy <robin.murphy@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Fixes: 512881eacfa7 ("bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management")
Reported-by: Takashi Iwai <tiwai@xxxxxxx>
Tested-by: Takashi Iwai <tiwai@xxxxxxx>
Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Reviewed-by: Robin Murphy <robin.murphy@xxxxxxx>
Link: https://lore.kernel.org/r/0-v1-ea566e16b06b+811-amd_owner_jgg@xxxxxxxxxx
Signed-off-by: Joerg Roedel <jroedel@xxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
drivers/iommu/amd/iommu_v2.c | 2 ++
drivers/iommu/iommu.c | 21 +++++++++++++++++++--
2 files changed, 21 insertions(+), 2 deletions(-)

--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -786,6 +786,8 @@ int amd_iommu_init_device(struct pci_dev
if (dev_state->domain == NULL)
goto out_free_states;

+ /* See iommu_is_default_domain() */
+ dev_state->domain->type = IOMMU_DOMAIN_IDENTITY;
amd_iommu_domain_direct_map(dev_state->domain);

ret = amd_iommu_domain_enable_v2(dev_state->domain, pasids);
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3089,6 +3089,24 @@ out:
return ret;
}

+static bool iommu_is_default_domain(struct iommu_group *group)
+{
+ if (group->domain == group->default_domain)
+ return true;
+
+ /*
+ * If the default domain was set to identity and it is still an identity
+ * domain then we consider this a pass. This happens because of
+ * amd_iommu_init_device() replacing the default idenytity domain with an
+ * identity domain that has a different configuration for AMDGPU.
+ */
+ if (group->default_domain &&
+ group->default_domain->type == IOMMU_DOMAIN_IDENTITY &&
+ group->domain && group->domain->type == IOMMU_DOMAIN_IDENTITY)
+ return true;
+ return false;
+}
+
/**
* iommu_device_use_default_domain() - Device driver wants to handle device
* DMA through the kernel DMA API.
@@ -3107,8 +3125,7 @@ int iommu_device_use_default_domain(stru

mutex_lock(&group->mutex);
if (group->owner_cnt) {
- if (group->domain != group->default_domain ||
- group->owner) {
+ if (group->owner || !iommu_is_default_domain(group)) {
ret = -EBUSY;
goto unlock_out;
}