Re: [PATCH -v2] x86/boot/compressed: Register dummy NMI handler in EFI boot loader, to avoid kdump crashes

From: Borislav Petkov
Date: Tue Jan 10 2023 - 07:57:16 EST


On Tue, Jan 10, 2023 at 08:32:07PM +0800, Zeng Heng wrote:
> mce is registered on NMI handler by inject_init().

That's a handler for the NMI raised by raise_mce(). That's for the injection
case, which is simulated. If you're fixing the injection case, then surely not
with a bogus boot NMI handler.

> Yes, exactly. The following procedure is like:
>
> panic() -> relocate_kernel() -> identity_mapped() -> x86 purgatory image ->
> EFI loader -> secondary kernel

I'm doubtful now as you're injecting errors so you're not really in #MC context
but in this contrived context which is actually an NMI one. So we need to think
about how to fix this case.

Certainly not with an empty NMI handler...

Regardless, we should do

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 7832a69d170e..57fe376ed049 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -286,6 +286,8 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
 	if (!fake_panic) {
 		if (panic_timeout == 0)
 			panic_timeout = mca_cfg.panic_timeout;
+
+		mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
 		panic(msg);
 	} else
 		pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);

so that we do not run kexec in #MC context.
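
For illustration only, here is a small user-space model of what that write
changes. It assumes "#MC context" can be read off the MCIP (machine check in
progress) bit, bit 2 of MCG_STATUS. The fake_* names and in_mc_context() are
hypothetical stand-ins, not kernel APIs, and nothing here touches a real MSR:

```c
#include <stdint.h>

/* Bit 2 of IA32_MCG_STATUS: machine check in progress. */
#define MCG_STATUS_MCIP (1ULL << 2)

/* Hypothetical stand-in for the MSR; a plain variable, not hardware. */
static uint64_t fake_mcg_status;

/* Hypothetical stand-in for mce_wrmsrl(MSR_IA32_MCG_STATUS, val). */
static void fake_wrmsrl_mcg_status(uint64_t val)
{
	fake_mcg_status = val;
}

/* Model of #MC entry: the CPU sets MCIP before the handler runs. */
static void fake_mc_entry(void)
{
	fake_mcg_status |= MCG_STATUS_MCIP;
}

/* Returns 1 while the modelled CPU still considers a #MC in progress. */
static int in_mc_context(void)
{
	return (fake_mcg_status & MCG_STATUS_MCIP) != 0;
}
```

In this model, in_mc_context() returns 1 after fake_mc_entry() and 0 after
fake_wrmsrl_mcg_status(0), which is the state the diff establishes before
panic() hands over to kexec.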

Hmmm.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette