Re: [REGRESSION 5.19.x] AMD HD-audio devices missing on 5.19

From: Robin Murphy
Date: Tue Aug 23 2022 - 17:02:24 EST


On 2022-08-23 21:28, Jason Gunthorpe wrote:
On Tue, Aug 23, 2022 at 01:46:36PM +0200, Takashi Iwai wrote:
It was tested now and confirmed that the call path is via AMDGPU, as
expected:
amdgpu_pci_probe ->
amdgpu_driver_load_kms ->
amdgpu_device_init ->
amdgpu_amdkfd_device_init ->
kgd2kfd_device_init ->
kgd2kfd_resume_iommu ->
kfd_iommu_resume ->
amd_iommu_init_device ->
iommu_attach_group ->
__iommu_attach_group

Oh, when you said sound intel I thought this was an Intel CPU..

Yes, there is this hacky private path from the amdgpu to
the amd iommu driver that makes a mess of it here. We discussed it in
this thread:

https://lore.kernel.org/linux-iommu/YgtuJQhY8SNlv9%2F6@xxxxxxxxxx/

But nobody put it together that it would be a problem with this.

Something like this, perhaps, but I didn't check if overriding the
type would cause other problems.

diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 696d5555be5794..6a1f02c62dffcc 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -777,6 +777,8 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
if (dev_state->domain == NULL)
goto out_free_states;
+ /* See iommu_is_default_domain() */
+ dev_state->domain->type = IOMMU_DOMAIN_IDENTITY;
amd_iommu_domain_direct_map(dev_state->domain);

Same question as 6 months ago, apparently: allocating an unmanaged domain with a pagetable then sucking out the pagetable is silly enough, but if we're going to then also call it a proper identity domain, we should really just allocate an identity domain directly; but then why not just enable_v2 on the identity domain that we know is already there courtesy of def_domain_type?

Robin.


Intentional whitespace-damaged patch based on wrong kernel version to stress how much it is for illustration purposes only:
----->8-----
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index fb61bdca4c2c..2925103cd71a 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -135,9 +135,6 @@ static void free_device_state(struct device_state *dev_state)

iommu_group_put(group);

- /* Everything is down now, free the IOMMUv2 domain */
- iommu_domain_free(dev_state->domain);
-
/* Finally get rid of the device-state */
kfree(dev_state);
}
@@ -730,7 +727,6 @@ EXPORT_SYMBOL(amd_iommu_unbind_pasid);
int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
{
struct device_state *dev_state;
- struct iommu_group *group;
unsigned long flags;
int ret, tmp;
u16 devid;
@@ -773,34 +769,20 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
if (dev_state->states == NULL)
goto out_free_dev_state;

- dev_state->domain = iommu_domain_alloc(&pci_bus_type);
- if (dev_state->domain == NULL)
+ dev_state->domain = iommu_get_domain_for_dev(&pdev->dev);
+ if (dev_state->domain->type != IOMMU_DOMAIN_IDENTITY)
goto out_free_states;

- amd_iommu_domain_direct_map(dev_state->domain);
-
ret = amd_iommu_domain_enable_v2(dev_state->domain, pasids);
if (ret)
- goto out_free_domain;
-
- group = iommu_group_get(&pdev->dev);
- if (!group) {
- ret = -EINVAL;
- goto out_free_domain;
- }
-
- ret = iommu_attach_group(dev_state->domain, group);
- if (ret != 0)
- goto out_drop_group;
-
- iommu_group_put(group);
+ goto out_free_states;

spin_lock_irqsave(&state_lock, flags);

if (__get_device_state(devid) != NULL) {
spin_unlock_irqrestore(&state_lock, flags);
ret = -EBUSY;
- goto out_free_domain;
+ goto out_free_states;
}

list_add_tail(&dev_state->list, &state_list);
@@ -809,12 +791,6 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)

return 0;

-out_drop_group:
- iommu_group_put(group);
-
-out_free_domain:
- iommu_domain_free(dev_state->domain);
-
out_free_states:
free_page((unsigned long)dev_state->states);