Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

From: Christian König
Date: Wed Jan 11 2023 - 03:36:30 EST


Hi Matt,

after reading a bit into the topic I think I know what's going on here.

The assumption that you need ACS to enable PASID handling is simply incorrect.

I'm going to send a revert of the offending patch with an in-depth description of the problem.

Thanks,
Christian.

Am 10.01.23 um 21:51 schrieb Matt Fagnani:
Christian,

I'm attaching the output of sudo lspci -vvvv. I'm not sure what $bus_id is in this case. I guess it might be 00 in 00:00.0. I attached the dmesg from previous boots with 6.2-rc1 at https://bugzilla.kernel.org/show_bug.cgi?id=216865#c2 as I mentioned at https://lore.kernel.org/all/52583644-d875-a454-7288-8b00ea0566ae@bell.net/ and 6.2-rc2 + Vasant's patch with rd.driver.blacklist=amdgpu on the kernel command line at https://lore.kernel.org/all/ff26929d-9fb0-3c85-2594-dc2937c1ba9a@bell.net/
I'm using the Radeon R5 integrated GPU, which is called Wani in lspci and Carrizo in dmesg. The CPU is an AMD A10-9620P, which is Bristol Ridge or Excavator+ according to https://en.wikipedia.org/wiki/List_of_AMD_accelerated_processing_units
I'm using the internal Elan touchscreen in the laptop. I'm not using the HDMI port for an external monitor or audio, which I think is called Kabini HDMI/DP Audio in lspci.

Thanks,

Matt

On 1/10/23 08:56, Christian König wrote:
Am 10.01.23 um 14:51 schrieb Jason Gunthorpe:
On Tue, Jan 10, 2023 at 02:45:30PM +0100, Christian König wrote:

Since this is a device integrated in the CPU it could be that the ACS/ATS
functionalities are controlled by the BIOS and can be enabled/disabled
there. But this should always enable/disable both.
This sounds like a GPU driver bug then, it should tolerate PASID being
unavailable because of BIOS issues/whatever and not black screen on
boot?

Yeah, potentially. Could I get a full "sudo lspci -vvvv -s $bus_id" + dmesg of that device?

Thanks,
Christian.


Jason