[PATCH v2] x86/mce: fix wrong no-return-ip logic in do_machine_check()

From: Aili Yao
Date: Sun Feb 21 2021 - 22:50:58 EST


Since commit b2f9d678e28c ("x86/mce: Check for faults tagged in
EXTABLE_CLASS_FAULT exception table entries"), when there is a
memory MCE_AR_SEVERITY error with no return IP, only a SIGBUS
signal is sent to current. As the page is not poisoned, the coredump
step that the kernel runs for the signalled process touches the error
page again, which results in a fatal error. We need to poison the page
and then kill current in the memory-failure module.

So fix it by using the original checking method.

Signed-off-by: Aili Yao <yaoaili@xxxxxxxxxxxx>
---
arch/x86/kernel/cpu/mce/core.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
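
For reviewers, the change in behaviour can be modelled in userspace.
This is only a sketch, not kernel code: the enum names, old_action()
and new_action() are hypothetical, and it assumes (from my reading of
queue_task_work()) that a zero second argument selects the
memory_failure() path while a non-zero one sends SIGBUS directly.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's severity grades. */
enum severity { SEV_AO, SEV_AR };

/* What the queued task work ends up doing (assumed model). */
enum action {
	ACT_NONE,           /* nothing queued */
	ACT_MEMORY_FAILURE, /* poison the page, kill only if needed */
	ACT_KILL_NOW        /* plain SIGBUS, page left unpoisoned */
};

/* Before the patch: always queue, honouring kill_current_task. */
enum action old_action(enum severity worst, int kill_current_task)
{
	(void)worst;
	return kill_current_task ? ACT_KILL_NOW : ACT_MEMORY_FAILURE;
}

/* After the patch: AR errors always go through memory failure. */
enum action new_action(enum severity worst, int kill_current_task)
{
	if (worst == SEV_AR)
		return ACT_MEMORY_FAILURE;
	if (kill_current_task)
		return ACT_KILL_NOW;
	return ACT_NONE;
}

/* The case from the commit message: AR error with no return IP. */
void check_no_return_ip_case(void)
{
	assert(old_action(SEV_AR, 1) == ACT_KILL_NOW);
	assert(new_action(SEV_AR, 1) == ACT_MEMORY_FAILURE);
}
```

In this model the only input whose outcome changes from an immediate
kill to the poisoning path is (SEV_AR, kill_current_task set), which
is the case described in the commit message.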

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index e133ce1e562b..70380d7d98b3 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1414,7 +1414,10 @@ noinstr void do_machine_check(struct pt_regs *regs)
 		/* If this triggers there is no way to recover. Die hard. */
 		BUG_ON(!on_thread_stack() || !user_mode(regs));

-		queue_task_work(&m, kill_current_task);
+		if (worst == MCE_AR_SEVERITY)
+			queue_task_work(&m, 0);
+		else if (kill_current_task)
+			queue_task_work(&m, kill_current_task);

 	} else {
 		/*
--
2.25.1