[PATCH 1/7] EDAC/skx_common: Fix general protection fault

From: Qiuxu Zhuo
Date: Thu Apr 17 2025 - 11:08:59 EST


After loading i10nm_edac (which automatically loads skx_edac_common), if
unload only i10nm_edac, then reload it and perform error injection testing,
a general protection fault may occur:

mce: [Hardware Error]: Machine check events logged
Oops: general protection fault ...
...
Workqueue: events mce_gen_pool_process
RIP: 0010:string+0x53/0xe0
...
Call Trace:
<TASK>
? die_addr+0x37/0x90
? exc_general_protection+0x1e7/0x3f0
? asm_exc_general_protection+0x26/0x30
? string+0x53/0xe0
vsnprintf+0x23e/0x4c0
snprintf+0x4d/0x70
skx_adxl_decode+0x16a/0x330 [skx_edac_common]
skx_mce_check_error.part.0+0xf8/0x220 [skx_edac_common]
skx_mce_check_error+0x17/0x20 [skx_edac_common]
...

The issue arose was because the variable 'adxl_component_count' (inside
skx_edac_common), which counts the ADXL components, was not reset. During
the reloading of i10nm_edac, the count was incremented by the actual number
of ADXL components again, resulting in a count that was double the real
number of ADXL components. This led to an out-of-bounds reference to the
ADXL component array, causing the general protection fault above.

Fix this issue by resetting the 'adxl_component_count' in adxl_put(),
which is called during the unloading of {skx,i10nm}_edac.

Fixes: 123b15863550 ("EDAC, i10nm: make skx_common.o a separate module")
Reported-by: Feng Xu <feng.f.xu@xxxxxxxxx>
Tested-by: Feng Xu <feng.f.xu@xxxxxxxxx>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
---
drivers/edac/skx_common.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/edac/skx_common.c b/drivers/edac/skx_common.c
index fa5b442b1844..c9ade45c1a99 100644
--- a/drivers/edac/skx_common.c
+++ b/drivers/edac/skx_common.c
@@ -116,6 +116,7 @@ EXPORT_SYMBOL_GPL(skx_adxl_get);

void skx_adxl_put(void)
{
+ adxl_component_count = 0;
kfree(adxl_values);
kfree(adxl_msg);
}
--
2.43.0