Re: Kernel and ADM hardware roulette ( was AMD graphics performance regression in 4.15 and later )

From: Christian König
Date: Thu Jun 07 2018 - 03:07:51 EST


Am 06.06.2018 um 17:44 schrieb Gabriel C:
2018-06-06 17:03 GMT+02:00 Michel Dänzer <michel@xxxxxxxxxxx>:
On 2018-06-06 04:44 PM, Christian König wrote:
Am 06.06.2018 um 16:12 schrieb Michel Dänzer:
[SNIP]
At least in theory it should work when we use the coherent DMA allocator.

If that really worked before, then the most likely commit which broke
this is:

commit fd5fd480dd8fe4910546e7b080b3ae345e57fe9f
Author: Chunming Zhou <david1.zhou@xxxxxxx>
Date: Fri Feb 9 10:44:09 2018 +0800

drm/amdgpu: only enable swiotlb alloc when need v2

get the max io mapping address of system memory to see if it is over
our card accessing range.
v2: move checking later

Signed-off-by: Chunming Zhou <david1.zhou@xxxxxxx>
Reviewed-by: Monk Liu <monk.liu@xxxxxxx>
Reviewed-by: Christian König <christian.koenig@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
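
For illustration only, the check that commit message describes (compare the highest system memory address against the range the card can access) might be sketched as below. This is a standalone sketch, not the driver's code: the helper name needs_swiotlb and its parameters max_iomem_addr and dma_bits are hypothetical, and the real driver obtains both values from kernel interfaces not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the amdgpu/radeon source.
 * Returns true when the highest system memory address lies beyond what a
 * device with `dma_bits` address bits can reach, i.e. when the swiotlb
 * path would be needed.
 */
static bool needs_swiotlb(uint64_t max_iomem_addr, unsigned int dma_bits)
{
    /* A device with 64 or more address bits can reach everything. */
    if (dma_bits >= 64)
        return false;

    /* Assumption: the limit is 2^dma_bits, as the commit text implies. */
    return max_iomem_addr > ((uint64_t)1 << dma_bits);
}
```

Under these assumptions, a card with 40 address bits would need swiotlb on a system whose memory reaches 2^41, but not on one whose memory ends at 4 GiB.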

Currently looking into how we could somehow improve this detection.
I guess this could fit for Gabriel, but e.g.
https://bugs.freedesktop.org/104437 says amdgpu was already broken with
SME in 4.15, if not 4.14 (I suspect there was simply no SME support
earlier).

And what I totally missed is that Gabriel is using radeon and not amdgpu.

So Gabriel you need to revert this one for testing:
commit 1bc3d3cce8c3b44c2b5ac6cee98c830bb40e6b0f
Author: Chunming Zhou <david1.zhou@xxxxxxx>
Date: Fri Feb 9 10:44:10 2018 +0800

    drm/radeon: only enable swiotlb path when need v2

    swiotlb expands our card accessing range, but its path always is slower
    than ttm pool allocation.
    So add condition to use it.
    v2: move a bit later

    Signed-off-by: Chunming Zhou <david1.zhou@xxxxxxx>
    Reviewed-by: Monk Liu <monk.liu@xxxxxxx>
    Reviewed-by: Christian König <christian.koenig@xxxxxxx>
    Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180209024410.1469-3-david1.zhou@xxxxxxx

I got a strange performance issue with 4.15 and 4.16, but SME was ON
on that setup (even before it hit mainline) and never broke the GPU like this.

Well that is very interesting, you are the first one who reports that SME + GFX works in some way. So far we only got negative reports for that.

There is a 4.16.13 boot dmesg which has no such issue:

http://ftp.frugalware.org/pub/other/people/crazy/radeon/dmesg-radeon-SME-ON-kernel-4.16.txt

With the setup as is, booting 4.16.x works, while 4.17 throws the errors.

Please do the bisect if the patch I've mentioned above doesn't help.

Thanks,
Christian.



--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer