[tip:ras/core] x86/mce: Log MCEs after a warm rest on AMD, Fam17h and later

From: tip-bot for Aravind Gopalakrishnan
Date: Tue May 03 2016 - 03:48:27 EST


Commit-ID: 10001d91aa0efc793952051f9070a569cc388ebc
Gitweb: http://git.kernel.org/tip/10001d91aa0efc793952051f9070a569cc388ebc
Author: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@xxxxxxx>
AuthorDate: Sat, 30 Apr 2016 14:33:51 +0200
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Tue, 3 May 2016 08:24:15 +0200

x86/mce: Log MCEs after a warm rest on AMD, Fam17h and later

For Fam17h, we want to report errors that persist across reboots. Error
persistence is dependent on HW and no BIOS currently fiddles with values
here. So allow reporting of errors upon boot until something goes wrong.

Logging is disabled on older families because BIOS didn't clear the MCA
banks after a cold reset.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@xxxxxxx>
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@xxxxxxx>
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@xxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: linux-edac <linux-edac@xxxxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1459886686-13977-2-git-send-email-Yazen.Ghannam@xxxxxxx
Link: http://lkml.kernel.org/r/1462019637-16474-2-git-send-email-bp@xxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
arch/x86/kernel/cpu/mcheck/mce.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 6b7039c..43f8b49 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1495,7 +1495,7 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
*/
clear_bit(10, (unsigned long *)&mce_banks[4].ctl);
}
- if (c->x86 <= 17 && cfg->bootlog < 0) {
+ if (c->x86 < 17 && cfg->bootlog < 0) {
/*
* Lots of broken BIOS around that don't clear them
* by default and leave crap in there. Don't log: