Re: [PATCH -tip] x86, mce: CE in last bank prevents panic by unknown MCE

From: Ingo Molnar
Date: Wed Aug 26 2009 - 05:14:26 EST



* Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx> wrote:

> [based on tip/x86/mce]
>
> If the MCE handler is called but none of the mces_seen entries holds
> a machine check event that could have signaled the MCE (i.e. an
> event with severity higher than MCE_KEEP_SEVERITY), the handler
> panics with "Machine check from unknown source", since the MCE is
> assumed to have been signaled by an external agent or the like.
>
> Usually mces_seen never points to an MCE_KEEP_SEVERITY event such as
> a CE. But it can happen, because the initial value of mces_seen is
> accidentally modified by mce_no_way_out(): if mce_no_way_out() runs
> through all banks and the last bank has a CE, mces_seen points to
> that CE and the "unknown source" panic is not taken.
>
> This patch fixes this undesired behavior and clarifies the logic.
>
> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx>
> Reported-by: Jin Dongming <jin.dongming@xxxxxxxxxxxxxxxxxx>
>
> ---
> arch/x86/kernel/cpu/mcheck/mce.c | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)

applied, thanks!

Btw., I had a quick look at
arch/x86/kernel/cpu/mcheck/mce-severity.c, and it is quite a pile of
unclean, over-engineered crap really.

Would you be interested in sending me a patch that converts that to
clean, proper C code that just checks the bits in a straightforward,
readable way? We don't need that silly, unreadable table with the
macro jungle and we definitely don't want to expose it via debugfs -
the debugfs bits can be removed altogether.

[ Plus in the mce_severity() implementation please rename 'a' to at
least 'm' - that name choice for a variable shows zero taste. We
don't program the kernel in BASIC with 'A', 'B' and 'C' variable
names anymore. ]
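
To illustrate what "just checks the bits in a straightforward way" could
look like, here is a rough user-space sketch. It is not the kernel's
mce_severity() and does not reproduce the rules in mce-severity.c: the
struct, the enum and the (heavily simplified) rule set are hypothetical
and exist only for this example. Only the MCI_STATUS_* bit positions are
taken from the architectural IA32_MCi_STATUS layout.

```c
#include <stdint.h>

/* Architectural bit positions in IA32_MCi_STATUS. */
#define MCI_STATUS_VAL	(1ULL << 63)	/* register contents are valid */
#define MCI_STATUS_OVER	(1ULL << 62)	/* an earlier error was overwritten */
#define MCI_STATUS_UC	(1ULL << 61)	/* error was not corrected */
#define MCI_STATUS_PCC	(1ULL << 57)	/* processor context corrupt */

/* Hypothetical stand-in for struct mce; only the field used here. */
struct mce_sketch {
	uint64_t status;
};

/* Hypothetical severity levels, ordered from harmless to fatal. */
enum sketch_severity {
	SKETCH_NONE,	/* nothing valid logged in this bank */
	SKETCH_KEEP,	/* corrected error: log it, do not act on it */
	SKETCH_UC,	/* uncorrected error */
	SKETCH_PANIC,	/* no safe way to continue */
};

/*
 * Grade one bank with plain if-statements instead of a table of
 * mask/result pairs. The rules are an illustrative subset only.
 */
static enum sketch_severity sketch_mce_severity(const struct mce_sketch *m)
{
	if (!(m->status & MCI_STATUS_VAL))
		return SKETCH_NONE;

	if (m->status & MCI_STATUS_PCC)
		return SKETCH_PANIC;

	if (m->status & MCI_STATUS_UC) {
		/* An uncorrected error that also lost information. */
		if (m->status & MCI_STATUS_OVER)
			return SKETCH_PANIC;
		return SKETCH_UC;
	}

	/* Valid but corrected: the CE case from the patch description. */
	return SKETCH_KEEP;
}
```

With this shape each rule is one readable test on 'm->status', and a
bank holding only a CE grades as SKETCH_KEEP, i.e. it would never count
as the source of the machine check.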

Thanks,

Ingo