Re: [PATCH v3 1/1] PCI: Add translated request only flag for pci_enable_pasid()

From: Jason Gunthorpe
Date: Tue Jan 31 2023 - 21:36:34 EST


On Tue, Jan 31, 2023 at 06:14:19PM -0600, Bjorn Helgaas wrote:

> > AMD GPU is one of those devices.
>
> I guess you mean the AMD GPU has ATS, PRI, and PASID Capabilities?
> And furthermore, that the GPU *always* uses Translated addresses with
> PASID?

I'm not versed in the spec lingo, but the GPU issues MemRd/Wrs with
the translated bit set and no PASID header - which is the correct form
for an address that was translated by ATS.

To get those translated addresses it first issues ATS requests, and
only the ATS-related requests carry the PASID.

ATS-related requests always route to the root port, which is why this
is functionally equivalent to ACS RR/UF in these cases.

Translated requests always route where they are supposed to go, even
with P2P and things.
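
As a rough illustration of the behaviour described above, here is a
toy model in plain C. It is not kernel code and not taken from the
spec: every name in it (tlp_kind, make_tlp, route, and so on) is
hypothetical, and it only encodes the two cases mentioned in this
mail. The point it tries to show is that, for such a device, nothing
carrying a PASID can end up anywhere other than the root port.

```c
#include <stdbool.h>

/* Hypothetical request kinds; only the two cases from this mail. */
enum tlp_kind {
	TLP_ATS_TRANSLATION_REQ,	/* ATS request, carries the PASID */
	TLP_MEM_TRANSLATED,		/* MemRd/MemWr with translated bit set */
	TLP_KIND_COUNT
};

/* Hypothetical destinations for the toy routing model. */
enum tlp_dest {
	DEST_ROOT_PORT,			/* always goes upstream to the root port */
	DEST_ADDRESS_ROUTED		/* goes wherever the address points, maybe P2P */
};

struct tlp {
	enum tlp_kind kind;
	bool has_pasid;
};

/* Build the request as the device described above would emit it. */
static struct tlp make_tlp(enum tlp_kind kind)
{
	struct tlp t = {
		.kind = kind,
		/* Assumption from the mail: only ATS requests carry PASID. */
		.has_pasid = (kind == TLP_ATS_TRANSLATION_REQ),
	};
	return t;
}

/* ATS requests go to the root port; translated requests are address routed. */
static enum tlp_dest route(const struct tlp *t)
{
	if (t->kind == TLP_ATS_TRANSLATION_REQ)
		return DEST_ROOT_PORT;
	return DEST_ADDRESS_ROUTED;
}

/* True if any PASID-carrying request could avoid the root port. */
static bool pasid_tlp_can_bypass_root_port(void)
{
	for (int k = 0; k < TLP_KIND_COUNT; k++) {
		struct tlp t = make_tlp((enum tlp_kind)k);

		if (t.has_pasid && route(&t) != DEST_ROOT_PORT)
			return true;
	}
	return false;
}
```

In this model pasid_tlp_can_bypass_root_port() returns false, which is
the property that ACS RR/UF would otherwise be needed to guarantee.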

> And this applies even if there is no ACS or ACS doesn't support
> PCI_ACS_RR and PCI_ACS_UF.
>
> The black screen happens because ... ?

Bugs in the AMD GPU driver blow up if it cannot set up PASID.

> I couldn't figure out the NULL pointer dereference. I expected it to
> be from a BUG() or similar in report_iommu_fault(), but I don't see
> that.

IIRC it is buggy error-unwind handling in the AMD GPU driver.
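
For readers unfamiliar with the term, the sketch below shows the
general shape of error unwinding in C. It is not the AMD GPU driver
code, and this thread does not show where the actual bug is; struct
ctx, setup_pasid(), ctx_init() and ctx_fini() are all hypothetical
names. It only illustrates how a failing optional step (here a
stand-in for PASID setup) can be unwound without leaving pointers
that later code would dereference.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical driver context with two resources. */
struct ctx {
	int *queue;		/* always needed */
	int *pasid_state;	/* only valid if PASID setup succeeded */
};

/* Stand-in for a PASID setup step that may fail. */
static int setup_pasid(struct ctx *c, int pasid_supported)
{
	if (!pasid_supported)
		return -ENODEV;

	c->pasid_state = calloc(1, sizeof(*c->pasid_state));
	return c->pasid_state ? 0 : -ENOMEM;
}

/* Initialise the context; on failure, leave it fully cleared. */
static int ctx_init(struct ctx *c, int pasid_supported)
{
	int ret;

	c->pasid_state = NULL;
	c->queue = calloc(1, sizeof(*c->queue));
	if (!c->queue)
		return -ENOMEM;

	ret = setup_pasid(c, pasid_supported);
	if (ret)
		goto err_free_queue;

	return 0;

err_free_queue:
	/* Undo only what succeeded, and clear the pointer afterwards. */
	free(c->queue);
	c->queue = NULL;
	return ret;
}

/* Tear down; safe to call on a context that failed to initialise. */
static void ctx_fini(struct ctx *c)
{
	free(c->pasid_state);	/* free(NULL) is a no-op */
	free(c->queue);
	c->pasid_state = NULL;
	c->queue = NULL;
}
```

A buggy unwind typically skips one of these steps, for example by
leaving a stale pointer behind or by tearing down state that was
never set up.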

Jason