Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

From: Christian König
Date: Tue Jan 10 2023 - 08:46:12 EST


Am 10.01.23 um 14:25 schrieb Jason Gunthorpe:
On Tue, Jan 10, 2023 at 01:48:39PM +0800, Baolu Lu wrote:
On 2023/1/6 22:14, Jason Gunthorpe wrote:
On Thu, Jan 05, 2023 at 03:57:28PM +0530, Vasant Hegde wrote:
Matt,

On 1/5/2023 6:39 AM, Matt Fagnani wrote:
I built 6.2-rc2 with the patch applied. The same black screen problem happened
with 6.2-rc2 with the patch. I tried to use early kdump with 6.2-rc2 with the
patch twice by panicking the kernel with sysrq+alt+c after the black screen
happened. The system rebooted after about 10-20 seconds both times, but no kdump
and dmesg files were saved in /var/crash. I'm attaching the lspci -vvv output as
requested.

Thanks for testing. As mentioned earlier I was not expecting this patch to fix
the black screen issue. It should fix kernel warnings and IOMMU page fault
related call traces. By any chance do you have the kernel boot logs?


@Baolu,
Looking into lspci output, it doesn't list ACS feature for Graphics card. So
with your fix it didn't enable PASID and hence it failed to boot.
The ACS checks being done are feature of the path not the end point or
root port.

If we are expecting ACS on the end port then it is just a bug in how
the test was written.. The test should be a NOP because there are no
switches in this topology.

Looking at it, this seems to just be because pci_enable_pasid is
calling pci_acs_path_enabled wrong, the only other user is here:

for (bus = pdev->bus; !pci_is_root_bus(bus); bus = bus->parent) {
if (!bus->self)
continue;

if (pci_acs_path_enabled(bus->self, NULL, REQ_ACS_FLAGS))
break;

pdev = bus->self;

group = iommu_group_get(&pdev->dev);
if (group)
return group;
}

And notice it is calling it on pdev->bus not on pdev itself which
naturally excludes the end point from the ACS validation.

So try something like:

if (!pci_acs_path_enabled(pdev->bus->self, NULL, PCI_ACS_RR | PCI_ACS_UF))

(and probably need to check for null ?)
Hi Matt,

Do you mind helping to test below change? No other change needed.

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index f9cc2e10b676..48f34cc996e4 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -382,8 +382,15 @@ int pci_enable_pasid(struct pci_dev *pdev, int
features)
if (!pasid)
return -EINVAL;

- if (!pci_acs_path_enabled(pdev, NULL, PCI_ACS_RR | PCI_ACS_UF))
- return -EINVAL;
+ if (pdev->multifunction) {
+ if (!pci_acs_path_enabled(pdev, NULL, PCI_ACS_RR |
PCI_ACS_UF))
+ return -EINVAL;
The AMD device is multi-function according to the lspci, and we
already know that 'pci_acs_path_enabled' will fail on it because that
is the problem..

Actually, I remember it is supposed to be like this:

https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flinux-iommu%2FYgpb6CxmTdUHiN50%408bytes.org%2F&data=05%7C01%7Cchristian.koenig%40amd.com%7Cb45e8c5a24394d66ae2908daf30e3802%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638089539666187724%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=vf9QsDFqp9s1NUxuP5iMsQJn1R0K9tVRTImTR6uZWAE%3D&reserved=0

The GPU and sound device are considered non-isolated by the group
code, presumably because of the missing ACS caps.

So, if I remember the issue, PCIe says that MemWr/Rd are routed
according to their address and ignore the PASID header.

A multifunction device is permitted to loop back DMAs one function
issues that match a MMIO BAR of another function. eg the GPU could DMA
to an MMIO address that overlaps the sound device and the function
will deliver the MMIO to the sound device not the host bridge even
though it is PASID tagged.

This is what get_pci_function_alias_group() is looking for.

Multifunction devices that do not do that are supposed to set the ACS
RR|UF bits and get_pci_function_alias_group()/etc are supposed to
succeed.

Thus - the PCI information is telling us that the AMD GPU device does
not support PASID because it may be looping back the MMIO to the other
functions on the device and thus creating an unacceptable hole in the
PASID address space.

So - we need AMD to comment on which of these describes their GPU device:

1) Is the issue that the PCI Caps are incorrect on this device and
there is no loopback? Thus we should fix it with a quirk to correct
the caps which will naturally split the iommu group too.

2) Is the device broken and loops back PASID DMAs and we are
legimiate and correct in blocking PASID? So far AMD just got lucky
that no user had a SVA that overlaps with MMIO? Seems unlikely

3) Is the device odd in that it doesn't loop back PASID tagged DMAs,
but does loop untagged? I would say this is non-compliant and PCI
provides no way to describe this. But we should again quirk it to
allow the PASID to be enabled but keep the group separated.

Mhm, I don't have a Kabini at hand but I have a Raven and there I see on the GPU:

    Capabilities: [2a0 v1] Access Control Services
        ACSCap:    SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        ACSCtl:    SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-

    Capabilities: [2b0 v1] Address Translation Service (ATS)
        ATSCap:    Invalidate Queue Depth: 00
        ATSCtl:    Enable+, Smallest Translation Unit: 00

On the bridge:

    Capabilities: [2a0 v1] Access Control Services
        ACSCap:    SrcValid+ TransBlk+ ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        ACSCtl:    SrcValid+ TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-

And I'm like 99% sure that Kabini/Wani should be identical to that.

Since this is a device integrated in the CPU it could be that the ACS/ATS functionalities are controlled by the BIOS and can be enabled/disabled there. But this should always enable/disable both.

Regards,
Christian.


Alex/Christian/Pan - can you please find out? The HW is:

00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Wani [Radeon R5/R6/R7 Graphics] (rev ca) (prog-if 00 [VGA controller])
DeviceName: ATI EG BROADWAY
Subsystem: Hewlett-Packard Company Device 8332
00:01.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Kabini HDMI/DP Audio
Subsystem: Hewlett-Packard Company Device 8332

https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2F223ee6d6-70ea-1d53-8bc2-2d22201d8dde%40bell.net%2F&data=05%7C01%7Cchristian.koenig%40amd.com%7Cb45e8c5a24394d66ae2908daf30e3802%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638089539666187724%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=TMB3pXS0eUKPZhcRCvUIxzvJvPYosvv3ofrFKZx7b%2FI%3D&reserved=0

+ } else {
+ if (!pdev->bus->self ||
+ !pci_acs_path_enabled(pdev->bus->self, NULL,
+ PCI_ACS_RR | PCI_ACS_UF))
+ return -EINVAL;
+ }
Why would these be exclusive? Both the path and endpoint needs to be
checked

Jason