Re: [PATCH for tip/mce3] x86, mce: Add options for corrected errors

From: Ingo Molnar
Date: Thu Jun 11 2009 - 05:47:23 EST



* Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx> wrote:

> [ Repost, rebased on tip/x86/mce3]
>
> This patch introduces three boot options (no_cmci, dont_log_ce and
> ignore_ce) to control handling for corrected errors.
>
> The "mce=no_cmci" boot option disables the CMCI feature. Since CMCI
> is a new feature, having a boot control to disable it will help
> if the hardware is misbehaving.
>
> The "mce=dont_log_ce" boot option disables logging for corrected
> errors. All reported corrected errors will be cleared silently.
> This option is useful if you do not care about corrected errors.
>
> The "mce=ignore_ce" boot option disables the features for corrected
> errors, i.e. the polling timer and CMCI. Corrected events are then
> not cleared and stay in the bank MSRs. Disabling these is usually
> not recommended, but it helps if there is a conflict with the BIOS
> or with hardware monitoring applications etc. that clear corrected
> events in the banks instead of the OS.
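
For reference, the three options described above can be modelled roughly as
follows. This is a user-space sketch, not the code from the patch: the flag
names are taken from the quoted hunk further down, while
parse_mce_ce_option() is a hypothetical helper that is assumed to receive
the text following "mce=".

```c
#include <string.h>

/* Flag names follow the quoted patch; everything else is illustrative. */
static int mce_dont_log_ce;	/* "mce=dont_log_ce": clear corrected errors without logging */
int mce_cmci_disabled;		/* "mce=no_cmci": do not use CMCI */
int mce_ignore_ce;		/* "mce=ignore_ce": no polling timer and no CMCI */

/*
 * Hypothetical helper: takes the text after "mce=" and sets the
 * matching flag. Returns 1 if the option was recognized, 0 otherwise.
 */
static int parse_mce_ce_option(const char *str)
{
	if (!strcmp(str, "no_cmci"))
		mce_cmci_disabled = 1;
	else if (!strcmp(str, "dont_log_ce"))
		mce_dont_log_ce = 1;
	else if (!strcmp(str, "ignore_ce"))
		mce_ignore_ce = 1;
	else
		return 0;
	return 1;
}
```

Unknown strings leave all three flags untouched, so a caller can fall
through to whatever other "mce=" handling exists.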

Applied to tip:x86/mce3, thanks Hidetoshi!

A few side notes:

Please introduce a sysctl for these too, for those where the flag can
be safely toggled after bootup (most of them look to be such flags).
Admins might want to tweak these options without rebooting the
system.

Even for those flags where a toggle means having to touch MSRs to
deactivate (or reactivate) CMCI we should do the sysctl thing, as
no-reboot configurability is king in this space.

A few random details:

> static int mce_bootlog = -1;
> static int monarch_timeout = -1;
> static int mce_panic_timeout;
> +static int mce_dont_log_ce;
> +int mce_cmci_disabled;
> +int mce_ignore_ce;
> int mce_ser;

All rarely-modified variables should be declared __read_mostly.
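
Something along these lines, as a sketch only. The empty fallback
definition of __read_mostly is an assumption added so that the fragment
also builds outside the kernel tree; in the kernel the annotation comes
from the kernel headers.

```c
/*
 * Stand-in so the sketch builds outside the kernel tree. In the kernel,
 * __read_mostly is provided by the kernel headers and places the
 * variable in a dedicated data section.
 */
#ifndef __read_mostly
#define __read_mostly
#endif

/* Set once at boot, read on every MCE path afterwards. */
static int mce_dont_log_ce __read_mostly;
int mce_cmci_disabled __read_mostly;
int mce_ignore_ce __read_mostly;
```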

> static char trigger[128];

Undocumented magic constant and meaninglessly named global variable,
please clean this up.
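
One possible shape for the cleanup, as a sketch. The names
MCE_HELPER_PATH_MAX, mce_helper and set_mce_helper() are hypothetical,
and it is an assumption here that the buffer holds the path of a
user-space program; the quoted hunk itself does not say.

```c
#include <string.h>

/* Hypothetical name for the former bare 128. */
#define MCE_HELPER_PATH_MAX 128

/* Assumed purpose: path of the user-space program to run on an event. */
static char mce_helper[MCE_HELPER_PATH_MAX];

/*
 * Hypothetical setter: stores the path if it fits, including the
 * terminating NUL. Returns 0 on success, -1 if the path is too long.
 */
static int set_mce_helper(const char *path)
{
	size_t len = strlen(path);

	if (len >= sizeof(mce_helper))
		return -1;
	memcpy(mce_helper, path, len + 1);
	return 0;
}
```

With the size named in one place, the bounds check uses sizeof() on the
buffer and cannot drift from the declaration.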

Ingo