Re: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ffmode for corrected errors

From: Borislav Petkov
Date: Wed Jun 19 2013 - 14:36:54 EST


On Wed, Jun 19, 2013 at 06:19:25PM +0000, Luck, Tony wrote:
> > Interesting, why? Why would we even need such an option? My impression
> > is, if ACPI tells us FF, MCE code doesn't poll those banks anymore. So
> > where do the duplicated reports come from?
>
> The option is only disabling the Linux side of firmware first ... the BIOS
> will still be doing it and generating records to feed to the OS using APEI.
>
> So Linux may see the error in a bank and report it, and BIOS may report
> the same error. Though I'd expect that to be rare as whoever saw it first
> would most likely clear the bank before the other could see it.
>
> I asked for the option because I'm nervous about just skipping some banks
> on the say-so of the BIOS ... what if the BIOS did something wrong? This
> option gives us a way to return to the way things were before this patch.

Yeah, the code I saw only disables the banks listed in the HEST:

mce_disable_ce_bank(mc_bank->bank_number)

and leaves the rest in poll mode. But I agree, we need this as a
fallback if BIOS is doing other crack smoking exercises and thus we want
to ignore FF completely.
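
To make the above concrete, here is a minimal userspace sketch, not the
actual patch. The names poll_banks, ff_disabled, disable_ce_bank(),
hest_apply_ff(), bank_is_polled() and NR_BANKS are made up for
illustration. Only the idea is from this thread: stop polling each bank
that HEST flags as firmware first, unless the boot option says to ignore
FF, in which case every bank stays in poll mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_BANKS 32			/* hypothetical bank count */

static uint32_t poll_banks = UINT32_MAX;	/* all banks polled by default */
static bool ff_disabled;			/* models the proposed boot option */

/* Stop polling one bank for corrected errors. */
static void disable_ce_bank(unsigned int bank)
{
	assert(bank < NR_BANKS);
	poll_banks &= ~(UINT32_C(1) << bank);
}

/*
 * Apply the firmware-first bank list taken from HEST. With the boot
 * option set, the list is ignored and every bank stays in poll mode.
 * Returns the number of banks handed over to firmware.
 */
static size_t hest_apply_ff(const unsigned int *banks, size_t n)
{
	size_t i;

	if (ff_disabled)
		return 0;

	for (i = 0; i < n; i++)
		disable_ce_bank(banks[i]);

	return n;
}

/* True if Linux still polls this bank for corrected errors. */
static bool bank_is_polled(unsigned int bank)
{
	assert(bank < NR_BANKS);
	return poll_banks & (UINT32_C(1) << bank);
}
```

With ff_disabled set, hest_apply_ff() is a no-op, which is the "return
to the way things were" fallback discussed above.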

> These parts are now looking good ... but we still need to tackle what
> Linux does when it does get the CPER record. I suspect we need to
> preserve the existing "fake an mcelog entry with just the address" on
> old platforms, but need to do something smarter on new ones.

Why, doesn't filling out struct mce and calling mce_log(mce) suffice?
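
Roughly what I have in mind, as a userspace sketch only. struct fake_mce
and struct fake_cper_mem are hypothetical, cut-down stand-ins for the
real struct mce and the CPER memory error section, and cper_to_mce() is
a made-up helper. The status bit positions are the x86 MCi_STATUS ones.
Whether this carries enough information on new platforms is exactly the
open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 MCi_STATUS bits used by this sketch. */
#define STATUS_VAL	(UINT64_C(1) << 63)	/* record is valid */
#define STATUS_UC	(UINT64_C(1) << 61)	/* uncorrected error */
#define STATUS_EN	(UINT64_C(1) << 60)	/* error reporting enabled */
#define STATUS_ADDRV	(UINT64_C(1) << 58)	/* addr field is valid */

/* Hypothetical, cut-down stand-in for struct mce. */
struct fake_mce {
	uint64_t status;
	uint64_t addr;
	int bank;
};

/* Hypothetical, cut-down stand-in for a CPER memory error section. */
struct fake_cper_mem {
	uint64_t physical_addr;
	bool addr_valid;
	bool corrected;
};

/* Build an mce-like record from the firmware-provided error data. */
static struct fake_mce cper_to_mce(const struct fake_cper_mem *e, int bank)
{
	struct fake_mce m = { .status = STATUS_VAL | STATUS_EN, .bank = bank };

	if (!e->corrected)
		m.status |= STATUS_UC;
	if (e->addr_valid) {
		m.status |= STATUS_ADDRV;
		m.addr = e->physical_addr;
	}
	return m;	/* the kernel side would hand this to mce_log() */
}
```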

I'll take a look at the rest of the stuff tomorrow, with a clear head.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.