Re: [PATCH] x86/mce: Fix initialization error warning

From: Prarit Bhargava
Date: Mon Jan 16 2017 - 18:16:24 EST




On 01/16/2017 05:43 PM, Borislav Petkov wrote:
> On Mon, Jan 16, 2017 at 05:06:02PM -0500, Prarit Bhargava wrote:
>> Yes, it was loud enough to generate a bug report from a user.
>
> Yeah, because all users are sane and we should do whatever they want -
> no questions asked. Especially those who boot with "mce=off".
>
> Did you actually ask that user why she/he is even booting with
> "mce=off"?

Yes, mce=off is the default for kdump:

KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory
mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail
acpi_no_memhotplug transparent_hugepage=never"

There is a race between the NMI completing on a CPU and the MCE
synchronization timing out. It results in a kernel panic in the kdump kernel
and the loss of the dump image. There have been a few attempts to fix it over
the years. The fix seems as simple as setting a flag in
native_machine_crash_shutdown() and querying it in do_machine_check() to avoid
the MCE/NMI race.
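
To make the idea concrete, here is a rough user-space sketch of the flag
approach. It is not a patch: crash_shutdown_in_progress, sketch_crash_shutdown()
and sketch_machine_check() are hypothetical names standing in for the flag,
native_machine_crash_shutdown() and do_machine_check(), and the real handler
would of course do much more than return a bool.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag; this is not an existing kernel symbol. */
static atomic_bool crash_shutdown_in_progress;

/*
 * Stand-in for native_machine_crash_shutdown(): record that the crash
 * path has started before the other CPUs are stopped via NMI.
 */
static void sketch_crash_shutdown(void)
{
	atomic_store(&crash_shutdown_in_progress, true);
	/* The real function would go on to shoot down the other CPUs. */
}

/*
 * Stand-in for do_machine_check(): returns true if the handler would
 * proceed to the cross-CPU MCE synchronization, false if it bails out
 * early because a crash shutdown is already under way.
 */
static bool sketch_machine_check(void)
{
	if (atomic_load(&crash_shutdown_in_progress))
		return false;	/* skip the synchronization that can time out */

	return true;		/* normal path: synchronize with other CPUs */
}
```

Assuming the flag is set before the NMI shootdown begins, a machine check
arriving afterwards would take the early return instead of waiting on CPUs
that are already parked.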

P.
