Re: [PATCH] drm/amdgpu: disable ASPM for legacy products that don't support ASPM

From: Gong, Richard
Date: Fri Apr 08 2022 - 12:15:53 EST


Hi Alex,

On 4/8/2022 10:54 AM, Alex Deucher wrote:
On Fri, Apr 8, 2022 at 11:47 AM Limonciello, Mario
<Mario.Limonciello@xxxxxxx> wrote:

-----Original Message-----
From: Gong, Richard <Richard.Gong@xxxxxxx>
Sent: Friday, April 8, 2022 10:45
To: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Koenig, Christian
<Christian.Koenig@xxxxxxx>; Pan, Xinhui <Xinhui.Pan@xxxxxxx>;
airlied@xxxxxxxx; daniel@xxxxxxxx
Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx;
linux-kernel@xxxxxxxxxxxxxxx; Limonciello, Mario <Mario.Limonciello@xxxxxxx>;
Gong, Richard <Richard.Gong@xxxxxxx>
Subject: [PATCH] drm/amdgpu: disable ASPM for legacy products that don't
support ASPM

Active State Power Management (ASPM) has been enabled since kernel 5.14.
However, some legacy products (WX3200 and RX640 are examples) do not
support ASPM. When they are used for video/display output, the system
hangs during suspend/resume.

Add an extra check to disable ASPM for old products that don't have
ASPM support.
The patch description is incorrect. ASPM works just fine on these
GPUs. It's more of an issue with whether the underlying platform
supports ASPM or not. Rather than disabling a chip family, I would
prefer to add a check for problematic platforms and disable ASPM on
those platforms.

I thought so initially.

But I found that suspend/resume works just fine on the "problematic" platform (Dell Precision 3660, Intel ADL based) with an AMD W6400 graphics card. With a WX3200 or RX640, suspend/resume works only when ASPM is disabled. Both the WX3200 and the RX640 belong to the CHIP_POLARIS12 family.

This is why I took the chip-family approach.

Regards,

Richard

Alex

Signed-off-by: Richard Gong <richard.gong@xxxxxxx>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index bb1c025d9001..8987107f41ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2012,6 +2012,10 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
 	if (amdgpu_aspm == -1 && !pcie_aspm_enabled(pdev))
 		amdgpu_aspm = 0;

+	/* disable ASPM for the legacy products that don't support ASPM */
+	if ((flags & AMD_ASIC_MASK) == CHIP_POLARIS12)
+		amdgpu_aspm = 0;
+
I think it's problematic to disable it for the entire driver. There might be multiple
AMDGPUs in the system, and others may support ASPM.

Can it be done just as part of probe for Polaris?
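
To illustrate the concern above, here is a minimal, self-contained sketch (not the amdgpu code) of why writing to a module-wide setting from probe affects every device probed afterwards, next to a per-device alternative. All names (`fake_gpu`, `fake_chip`, `aspm_param`, `probe_global`, `probe_per_device`) are hypothetical stand-ins. `aspm_param` only models a shared setting such as `amdgpu_aspm`, assuming -1 means auto and 0 means off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real amdgpu types. */
enum fake_chip { FAKE_CHIP_POLARIS12, FAKE_CHIP_OTHER };

struct fake_gpu {
	enum fake_chip chip;
	bool aspm_on;	/* what this one device ends up using */
};

/* Models a module-wide setting like amdgpu_aspm (assumed: -1 auto, 0 off). */
static int aspm_param = -1;

/* Patch-style: probing one device rewrites the shared setting. */
static void probe_global(struct fake_gpu *gpu)
{
	assert(gpu != NULL);
	if (gpu->chip == FAKE_CHIP_POLARIS12)
		aspm_param = 0;	/* persists for every device probed later */
	gpu->aspm_on = (aspm_param != 0);
}

/* Alternative: leave the shared setting alone and decide per device. */
static void probe_per_device(struct fake_gpu *gpu)
{
	assert(gpu != NULL);
	gpu->aspm_on = (aspm_param != 0) &&
		       (gpu->chip != FAKE_CHIP_POLARIS12);
}
```

If a Polaris12 device is probed first and a second GPU afterwards, `probe_global` leaves both with ASPM off, while `probe_per_device` turns it off only for the Polaris12 device and leaves `aspm_param` unchanged.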

 	if (amdgpu_virtual_display ||
 	    amdgpu_device_asic_has_dc_support(flags & AMD_ASIC_MASK))
 		supports_atomic = true;
--
2.25.1