Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

From: Felix Kuehling
Date: Tue Jan 10 2023 - 10:22:57 EST


On 2023-01-10 at 10:19, Jason Gunthorpe wrote:
> On Tue, Jan 10, 2023 at 10:05:44AM -0500, Felix Kuehling wrote:
>> On 2023-01-10 at 08:45, Christian König wrote:
>>> And I'm like 99% sure that Kabini/Wani should be identical to that.
>> Kabini is not supported by KFD. There should be no calls to amd_iommu_...
>> functions on Kabini, at least not from kfd_iommu.c. And I'm not aware of any
>> other callers in amdgpu.ko.
> The backtrace from the system says otherwise..

That log is for Carrizo, not Kabini:

[   13.143970] [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874 0x103C:0x8332 0xCA).
Carrizo is supported by KFD, and it does support ATS/PRI.
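
For anyone who wants to repeat this check on another dmesg capture, here is a minimal sketch that pulls the ASIC name and PCI vendor:device IDs out of the amdgpu modesetting banner. The regular expression is modelled only on the one banner line quoted above, and the helper name asic_from_dmesg is made up for illustration. It only reports which chip a log belongs to; it says nothing about whether KFD supports that chip.

```python
import re

# Layout assumed from the single banner line quoted in this mail:
#   [drm] initializing kernel modesetting (<ASIC> <vendor>:<device> ...
BANNER = re.compile(
    r"\[drm\] initializing kernel modesetting \("
    r"(?P<asic>\w+) "
    r"(?P<vendor>0x[0-9A-Fa-f]{4}):(?P<device>0x[0-9A-Fa-f]{4})"
)


def asic_from_dmesg(text):
    """Return (asic, vendor, device) for the first banner in text, or None."""
    m = BANNER.search(text)
    if m is None:
        return None
    return m.group("asic"), m.group("vendor"), m.group("device")


# The banner line from the log discussed above (any stray quote markers dropped).
line = (
    "[   13.143970] [drm] initializing kernel modesetting "
    "(CARRIZO 0x1002:0x9874 0x103C:0x8332 0xCA)."
)
print(asic_from_dmesg(line))  # ('CARRIZO', '0x1002', '0x9874')
```

Feeding it a whole dmesg dump instead of a single line works the same way, since it returns the first banner it finds.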

Regards,
  Felix



> [ 13.515710] amd_iommu_attach_device+0x2e0/0x300
> [ 13.515719] __iommu_attach_device+0x1b/0x90
> [ 13.515727] iommu_attach_group+0x65/0xa0
> [ 13.515735] amd_iommu_init_device+0x16b/0x250 [iommu_v2]
> [ 13.515747] kfd_iommu_resume+0x4c/0x1a0 [amdgpu]
> [ 13.517094] kgd2kfd_resume_iommu+0x12/0x30 [amdgpu]
> [ 13.518419] kgd2kfd_device_init.cold+0x346/0x49a [amdgpu]
> [ 13.519699] amdgpu_amdkfd_device_init+0x142/0x1d0 [amdgpu]
> [ 13.520877] amdgpu_device_init.cold+0x19f5/0x1e21 [amdgpu]
> [ 13.522118] ? _raw_spin_lock_irqsave+0x23/0x50
> [ 13.522126] amdgpu_driver_load_kms+0x15/0x110 [amdgpu]
> [ 13.523386] amdgpu_pci_probe+0x161/0x370 [amdgpu]
> [ 13.524516] local_pci_probe+0x41/0x80
> [ 13.524525] pci_device_probe+0xb3/0x220
> [ 13.524533] really_probe+0xde/0x380
> [ 13.524540] ? pm_runtime_barrier+0x50/0x90
> [ 13.524546] __driver_probe_device+0x78/0x170
> [ 13.524555] driver_probe_device+0x1f/0x90
> [ 13.524560] __driver_attach+0xce/0x1c0
> [ 13.524565] ? __pfx___driver_attach+0x10/0x10
> [ 13.524570] bus_for_each_dev+0x73/0xa0
> [ 13.524575] bus_add_driver+0x1ae/0x200
> [ 13.524580] driver_register+0x89/0xe0
> [ 13.524586] ? __pfx_init_module+0x10/0x10 [amdgpu]
> [ 13.525819] do_one_initcall+0x59/0x230
> [ 13.525828] do_init_module+0x4a/0x200
> [ 13.525834] __do_sys_init_module+0x157/0x180
> [ 13.525839] do_syscall_64+0x5b/0x80
> [ 13.525845] ? handle_mm_fault+0xff/0x2f0
> [ 13.525850] ? do_user_addr_fault+0x1ef/0x690
> [ 13.525856] ? exc_page_fault+0x70/0x170
> [ 13.525860] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 13.525867] RIP: 0033:0x7fabd66cde4e
> https://lore.kernel.org/all/157c4ca4-370a-5d7e-fe32-c64d934f6979@xxxxxxx/
>
> Jason