Re: Error in amd driver?

From: Alex Deucher
Date: Mon May 06 2024 - 11:07:50 EST


On Mon, May 6, 2024 at 6:00 AM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> + amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>
> On Sun, May 05, 2024 at 09:59:22PM +0300, Tranton Baddy wrote:
> > I have had this in my dmesg since version 6.8.6; I am not sure exactly when it first appeared. Does the amdgpu driver have a bug?

Should be fixed in:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d3a9331a6591e9df64791e076f6591f440af51c3

Alex

> > [ 64.253144] ==================================================================
> > [ 64.253162] BUG: KFENCE: use-after-free read in amdgpu_bo_move+0x51f/0x7a0
> >
> > [ 64.253183] Use-after-free read at 0x00000000671c48dd (in kfence-#111):
> > [ 64.253192] amdgpu_bo_move+0x51f/0x7a0
> > [ 64.253202] ttm_bo_handle_move_mem+0xcf/0x180
> > [ 64.253211] ttm_mem_evict_first+0x1c5/0x500
> > [ 64.253218] ttm_resource_manager_evict_all+0xa3/0x1e0
> > [ 64.253228] amdgpu_device_prepare+0x66/0x110
> > [ 64.253237] amdgpu_pmops_runtime_suspend+0xbe/0x1c0
> > [ 64.253248] pci_pm_runtime_suspend+0x74/0x200
> > [ 64.253259] vga_switcheroo_runtime_suspend+0x21/0xb0
> > [ 64.253268] __rpm_callback+0x5f/0x190
> > [ 64.253277] rpm_callback+0x7f/0x90
> > [ 64.253283] rpm_suspend+0x120/0x6a0
> > [ 64.253290] pm_runtime_work+0x9c/0xa0
> > [ 64.253297] process_one_work+0x164/0x330
> > [ 64.253310] worker_thread+0x302/0x430
> > [ 64.253320] kthread+0xe4/0x110
> > [ 64.253329] ret_from_fork+0x4c/0x60
> > [ 64.253341] ret_from_fork_asm+0x1b/0x30
> >
> > [ 64.253353] kfence-#111: 0x00000000d018cf03-0x0000000034e821d1, size=96, cache=kmalloc-96
> >
> > [ 64.253363] allocated by task 152 on cpu 3 at 64.248952s:
> > [ 64.253418] kmalloc_trace+0x283/0x340
> > [ 64.253427] amdgpu_vram_mgr_new+0x8f/0x3f0
> > [ 64.253435] ttm_resource_alloc+0x39/0x90
> > [ 64.253444] ttm_bo_mem_space+0xa4/0x260
> > [ 64.253450] ttm_mem_evict_first+0x18a/0x500
> > [ 64.253456] ttm_resource_manager_evict_all+0xa3/0x1e0
> > [ 64.253465] amdgpu_device_prepare+0x66/0x110
> > [ 64.253472] amdgpu_pmops_runtime_suspend+0xbe/0x1c0
> > [ 64.253481] pci_pm_runtime_suspend+0x74/0x200
> > [ 64.253489] vga_switcheroo_runtime_suspend+0x21/0xb0
> > [ 64.253496] __rpm_callback+0x5f/0x190
> > [ 64.253503] rpm_callback+0x7f/0x90
> > [ 64.253509] rpm_suspend+0x120/0x6a0
> > [ 64.253516] pm_runtime_work+0x9c/0xa0
> > [ 64.253523] process_one_work+0x164/0x330
> > [ 64.253532] worker_thread+0x302/0x430
> > [ 64.253542] kthread+0xe4/0x110
> > [ 64.253550] ret_from_fork+0x4c/0x60
> > [ 64.253559] ret_from_fork_asm+0x1b/0x30
> >
> > [ 64.253570] freed by task 152 on cpu 3 at 64.253117s:
> > [ 64.253582] ttm_resource_free+0x67/0x90
> > [ 64.253591] ttm_bo_move_accel_cleanup+0x247/0x2e0
> > [ 64.253598] amdgpu_bo_move+0x1bd/0x7a0
> > [ 64.253605] ttm_bo_handle_move_mem+0xcf/0x180
> > [ 64.253612] ttm_mem_evict_first+0x1c5/0x500
> > [ 64.253618] ttm_resource_manager_evict_all+0xa3/0x1e0
> > [ 64.253626] amdgpu_device_prepare+0x66/0x110
> > [ 64.253634] amdgpu_pmops_runtime_suspend+0xbe/0x1c0
> > [ 64.253642] pci_pm_runtime_suspend+0x74/0x200
> > [ 64.253650] vga_switcheroo_runtime_suspend+0x21/0xb0
> > [ 64.253658] __rpm_callback+0x5f/0x190
> > [ 64.253664] rpm_callback+0x7f/0x90
> > [ 64.253671] rpm_suspend+0x120/0x6a0
> > [ 64.253677] pm_runtime_work+0x9c/0xa0
> > [ 64.253684] process_one_work+0x164/0x330
> > [ 64.253693] worker_thread+0x302/0x430
> > [ 64.253703] kthread+0xe4/0x110
> > [ 64.253711] ret_from_fork+0x4c/0x60
> > [ 64.253723] ret_from_fork_asm+0x1b/0x30
> >
> > [ 64.253735] CPU: 3 PID: 152 Comm: kworker/3:2 Tainted: P OE 6.8.9 #3 e7323d0d25f89e853881fc823e59523bdcc577c6
> > [ 64.253756] Hardware name: Hewlett-Packard HP Pavilion Notebook /80B9, BIOS F.54 05/27/2019
> > [ 64.253761] Workqueue: pm pm_runtime_work
> > [ 64.253771] ==================================================================
> >
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
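
For readers triaging similar reports: the three stacks above show the same
object being allocated under amdgpu_vram_mgr_new, freed through
ttm_resource_free (reached from amdgpu_bo_move+0x1bd), and then read again
later in the same function (amdgpu_bo_move+0x51f). The following is only a toy
model of that ordering problem, not the driver code and not necessarily how the
referenced commit fixes it. All names (toy_resource, toy_bo, toy_move,
toy_move_and_report) are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a placement descriptor (not the real ttm_resource). */
struct toy_resource {
    int mem_type;
};

/* Hypothetical stand-in for a buffer object that owns its current resource. */
struct toy_bo {
    struct toy_resource *resource;
};

/* Models a move helper that installs the new resource and frees the old one. */
static void toy_move(struct toy_bo *bo, struct toy_resource *new_res)
{
    struct toy_resource *old = bo->resource;

    bo->resource = new_res;
    free(old); /* after this point, 'old' must not be dereferenced */
}

/*
 * Safe ordering: copy whatever is needed from the old resource BEFORE the
 * call that frees it. Reading the old resource after toy_move() would be
 * the use-after-free pattern that the report above describes.
 */
static int toy_move_and_report(struct toy_bo *bo, struct toy_resource *new_res)
{
    int old_type;

    assert(bo != NULL && bo->resource != NULL && new_res != NULL);

    old_type = bo->resource->mem_type; /* read while still valid */
    toy_move(bo, new_res);             /* old resource is freed here */

    return old_type;                   /* uses only the saved copy */
}
```

In this sketch, toy_move_and_report() returns the memory type the object had
before the move, and the old pointer is never touched once it has been freed.
The general point is that anything needed from the old location has to be
captured before the call that releases it.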