Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

From: Nathan Chancellor
Date: Wed Apr 13 2022 - 11:40:40 EST


Hi Richard,

On Tue, Apr 12, 2022 at 04:50:00PM -0500, Richard Gong wrote:
> Active State Power Management (ASPM) feature is enabled since kernel 5.14.
> There are some AMD GFX cards (such as WX3200 and RX640) that won't work
> with ASPM-enabled Intel Alder Lake based systems. Using these GFX cards as
> video/display output, Intel Alder Lake based systems will hang during
> suspend/resume.
>
> The issue was initially reported on one system (Dell Precision 3660 with
> BIOS version 0.14.81), but was later confirmed to affect at least 4 Alder
> Lake based systems.
>
> Add extra check to disable ASPM on Intel Alder Lake based systems.
>
> Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
> Reported-by: kernel test robot <lkp@xxxxxxxxx>
> Signed-off-by: Richard Gong <richard.gong@xxxxxxx>
> ---
> v4: s/CONFIG_X86_64/CONFIG_X86
> enhanced check logic
> v3: s/intel_core_asom_chk/aspm_support_quirk_check
> correct build error with W=1 option
> v2: correct commit description
> move the check from chip family to problematic platform
> ---
> drivers/gpu/drm/amd/amdgpu/vi.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
> index 039b90cdc3bc..b33e0a9bee65 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
> @@ -81,6 +81,10 @@
> #include "mxgpu_vi.h"
> #include "amdgpu_dm.h"
>
> +#if IS_ENABLED(CONFIG_X86)
> +#include <asm/intel-family.h>
> +#endif
> +
> #define ixPCIE_LC_L1_PM_SUBSTATE 0x100100C6
> #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK 0x00000001L
> #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK 0x00000002L
> @@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct amdgpu_device *adev)
> WREG32_PCIE(ixPCIE_LC_CNTL, data);
> }
>
> +static bool aspm_support_quirk_check(void)
> +{
> + if (IS_ENABLED(CONFIG_X86)) {
> + struct cpuinfo_x86 *c = &cpu_data(0);
> +
> + return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
> + }

I have not seen this reported by a bot; sorry if it is a duplicate. This
breaks non-x86 builds (arm64 allmodconfig, for example):

drivers/gpu/drm/amd/amdgpu/vi.c:1144:28: error: implicit declaration of function 'cpu_data' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
struct cpuinfo_x86 *c = &cpu_data(0);
^
drivers/gpu/drm/amd/amdgpu/vi.c:1144:27: error: cannot take the address of an rvalue of type 'int'
struct cpuinfo_x86 *c = &cpu_data(0);
^~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/vi.c:1146:13: error: incomplete definition of type 'struct cpuinfo_x86'
return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
~^
drivers/gpu/drm/amd/amdgpu/vi.c:1144:10: note: forward declaration of 'struct cpuinfo_x86'
struct cpuinfo_x86 *c = &cpu_data(0);
^
drivers/gpu/drm/amd/amdgpu/vi.c:1146:28: error: incomplete definition of type 'struct cpuinfo_x86'
return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
~^
drivers/gpu/drm/amd/amdgpu/vi.c:1144:10: note: forward declaration of 'struct cpuinfo_x86'
struct cpuinfo_x86 *c = &cpu_data(0);
^
drivers/gpu/drm/amd/amdgpu/vi.c:1146:43: error: use of undeclared identifier 'INTEL_FAM6_ALDERLAKE'
return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
^
5 errors generated.

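'struct cpuinfo_x86' is only defined for CONFIG_X86, and a C-level
'if (IS_ENABLED(CONFIG_X86))' still has its body parsed and type-checked
on every architecture. This section needs to be guarded with the
preprocessor, which is how it was done in v2. Please go back to that.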
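
For illustration, the guarded shape could look roughly like the sketch
below. This is not the actual v2 hunk. It is a standalone approximation
that builds outside the kernel tree, so the struct layout, the fake
cpu_data() macro, and the 0x97 model value are stand-ins I am assuming
here; in vi.c they would come from <asm/processor.h> and
<asm/intel-family.h>, and the guard would normally be spelled
'#if IS_ENABLED(CONFIG_X86)' as the include above already is.

```c
#include <stdbool.h>

/*
 * Userspace stand-ins (assumptions, not kernel definitions) so that the
 * x86 branch also compiles when built with -DCONFIG_X86.
 */
#ifdef CONFIG_X86
struct cpuinfo_x86 {
	unsigned char x86;		/* CPU family */
	unsigned char x86_model;	/* CPU model */
};
#define INTEL_FAM6_ALDERLAKE 0x97	/* assumed value, for illustration */
static struct cpuinfo_x86 fake_boot_cpu = { .x86 = 6, .x86_model = 0x97 };
#define cpu_data(cpu) fake_boot_cpu	/* hypothetical replacement */
#endif

/* Returns true when ASPM may be enabled, false on the quirky platform. */
static bool aspm_support_quirk_check(void)
{
#ifdef CONFIG_X86
	/* Only the x86 build ever sees cpuinfo_x86 and cpu_data(). */
	struct cpuinfo_x86 *c = &cpu_data(0);

	return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
#else
	/* Non-x86: nothing to check, and nothing x86-specific to compile. */
	return true;
#endif
}
```

With the preprocessor guard, the non-x86 build never sees the x86-only
identifiers at all, whereas the IS_ENABLED() form relies on the compiler
discarding a branch that it must first be able to compile.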

Cheers,
Nathan