Re: [PATCH 1/2] x86: mce: kdump: use under_crashdumping to turn off MCE in all CPUs together

From: Naoya Horiguchi
Date: Mon Feb 23 2015 - 08:02:00 EST


# Resending this; sorry if you receive it twice.

On Mon, Feb 23, 2015 at 10:27:39AM +0100, Borislav Petkov wrote:
> On Mon, Feb 23, 2015 at 09:12:29AM +0000, Naoya Horiguchi wrote:
> > kexec disables (or "shoots down") all CPUs other than the crashing CPU before
> > entering the 2nd kernel. This is done via NMI, and the crashing CPU waits
> > for completion by spinning for at most 1 second.
> > However, there is a race window if this NMI handling doesn't complete within
> > that 1 second on some CPU, which causes a fragile situation where only a
> > portion of the online CPUs are responsive to MCE interrupts. If an MCE happens
> > during this race window, MCE synchronization always times out and results in
> > a kernel panic. So the user-visible effect of this bug is kdump failure.
> >
> > Note that this race window has existed since the current MCE handler was
> > implemented around 2.6.32, and recently commit 716079f66eac ("mce: Panic when
> > a core has reached a timeout") made it more visible by changing the default
> > behavior of the synchronization timeout from "ignore" to "panic".
>
> Let me guess: you could raise the tolerance level to 3 temporarily from
> native_machine_crash_shutdown() and not touch the #MC handler at all,
> right?

Yes, that would be a valid fix for the kdump failure itself, but I think
it might not be the best solution from the viewpoint of messaging to
userspace. What end users would see is one of these timeout messages:
- "Timeout: Not all CPUs entered broadcast exception handler",
- "Timeout: Subject CPUs unable to finish machine check processing",
- "Timeout: Monarch CPU unable to finish machine check processing", or
- "Timeout: Monarch CPU did not finish machine check processing".
These are informative for developers like us, but confusing for end users.
If we assume that what end users want to know is whether the kdump is
reliable or not, then "Machine Check ignored because crash dump is running."
sounds a bit better to me.

But yes, I agree that using mca_cfg->tolerant is a nice idea, so I'd like to
define another value to indicate that kdump is running. Does that make sense
to you?
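
A rough user-space sketch of that idea follows. It is not kernel code and not
the actual patch. The names TOLERANT_KDUMP, mce_timeout_decide() and
crash_shutdown_begin(), the value 4, and the "tolerant <= 1 panics on timeout"
threshold are all assumptions made for illustration only; the only things taken
from this thread are the use of a tolerant-like setting and the message text.

```c
#include <stddef.h>

/* HYPOTHETICAL extra tolerance value meaning "kdump is running".
 * The existing levels are 0..3; the value 4 is an assumption. */
#define TOLERANT_KDUMP 4

#define KDUMP_MSG "Machine Check ignored because crash dump is running."

enum timeout_action { TIMEOUT_PANIC, TIMEOUT_IGNORE };

struct timeout_decision {
	enum timeout_action action;
	const char *msg;	/* what the end user would see */
};

/* Stand-in for mca_cfg->tolerant; 1 is assumed to be the default. */
int tolerant = 1;

/* Hypothetical hook called at the start of crash shutdown. */
void crash_shutdown_begin(void)
{
	tolerant = TOLERANT_KDUMP;
}

/* Decide what to do when MCE synchronization times out.
 * Assumption: levels 0 and 1 panic on timeout, 2 and 3 ignore it. */
struct timeout_decision mce_timeout_decide(int level, const char *timeout_msg)
{
	struct timeout_decision d;

	if (level == TOLERANT_KDUMP) {
		/* kdump in progress: ignore, with a user-oriented message */
		d.action = TIMEOUT_IGNORE;
		d.msg = KDUMP_MSG;
	} else if (level <= 1) {
		d.action = TIMEOUT_PANIC;
		d.msg = timeout_msg;
	} else {
		d.action = TIMEOUT_IGNORE;
		d.msg = timeout_msg;
	}
	return d;
}
```

With something like this, a timeout during the shootdown window would no longer
panic, and the message would tell the user directly that the event was ignored
because of the crash dump, instead of showing one of the "Timeout: ..." strings.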

Thanks,
Naoya Horiguchi