[tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs

From: tip-bot2 for Tony Luck
Date: Mon Nov 02 2020 - 06:18:16 EST


The following commit has been merged into the ras/core branch of tip:

Commit-ID: 68299a42f84288537ee3420c431ac0115ccb90b1
Gitweb: https://git.kernel.org/tip/68299a42f84288537ee3420c431ac0115ccb90b1
Author: Tony Luck <tony.luck@xxxxxxxxx>
AuthorDate: Fri, 30 Oct 2020 12:04:00 -07:00
Committer: Borislav Petkov <bp@xxxxxxx>
CommitterDate: Mon, 02 Nov 2020 11:15:59 +01:00

x86/mce: Enable additional error logging on certain Intel CPUs

The Xeon versions of Sandy Bridge, Ivy Bridge and Haswell support an
optional additional error logging mode which is enabled by an MSR.

Previously, this mode was enabled from the mcelog(8) tool via /dev/cpu,
but userspace should not be poking at MSRs. So move the enabling into
the kernel.
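
For illustration only, a minimal sketch of the kind of userspace
read-modify-write this patch makes unnecessary. It assumes the msr(4)
character device convention, where the file offset selects the register
and each transfer is 8 bytes. The helper name msr_set_bits() and the
macro EXTRA_LOGGING_BIT are hypothetical; this is not taken from mcelog.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* expose pread()/pwrite() */
#endif
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define MSR_ERROR_CONTROL	0x17f	/* register number from this patch */
#define EXTRA_LOGGING_BIT	0x2ULL	/* hypothetical name for the "|= 2" bit */

/*
 * Read-modify-write one MSR through an already-open msr device fd.
 * The file offset selects the register; each transfer is 8 bytes.
 * Returns 0 on success, -1 if either transfer fails or is short.
 */
static int msr_set_bits(int fd, off_t reg, uint64_t bits)
{
	uint64_t val;

	if (pread(fd, &val, sizeof(val), reg) != (ssize_t)sizeof(val))
		return -1;
	val |= bits;
	if (pwrite(fd, &val, sizeof(val), reg) != (ssize_t)sizeof(val))
		return -1;
	return 0;
}
```

A caller would open a node such as /dev/cpu/0/msr with O_RDWR (root
required) and call msr_set_bits(fd, MSR_ERROR_CONTROL, EXTRA_LOGGING_BIT).
With this patch the kernel does the equivalent during MCE feature
initialization, so no such access from userspace is needed.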

[ bp: Correct the explanation why this is done. ]

Suggested-by: Boris Petkov <bp@xxxxxxxxx>
Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
Link: https://lkml.kernel.org/r/20201030190807.GA13884@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
---
 arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/kernel/cpu/mce/intel.c  | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 972a34d..b2dd264 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -139,6 +139,7 @@
 #define MSR_IA32_MCG_CAP		0x00000179
 #define MSR_IA32_MCG_STATUS		0x0000017a
 #define MSR_IA32_MCG_CTL		0x0000017b
+#define MSR_ERROR_CONTROL		0x0000017f
 #define MSR_IA32_MCG_EXT_CTL		0x000004d0
 
 #define MSR_OFFCORE_RSP_0		0x000001a6
diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index abe9fe0..b47883e 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -509,12 +509,32 @@ static void intel_ppin_init(struct cpuinfo_x86 *c)
 	}
 }
 
+/*
+ * Enable additional error logs from the integrated
+ * memory controller on processors that support this.
+ */
+static void intel_imc_init(struct cpuinfo_x86 *c)
+{
+	u64 error_control;
+
+	switch (c->x86_model) {
+	case INTEL_FAM6_SANDYBRIDGE_X:
+	case INTEL_FAM6_IVYBRIDGE_X:
+	case INTEL_FAM6_HASWELL_X:
+		rdmsrl(MSR_ERROR_CONTROL, error_control);
+		error_control |= 2;
+		wrmsrl(MSR_ERROR_CONTROL, error_control);
+		break;
+	}
+}
+
 void mce_intel_feature_init(struct cpuinfo_x86 *c)
 {
 	intel_init_thermal(c);
 	intel_init_cmci();
 	intel_init_lmce();
 	intel_ppin_init(c);
+	intel_imc_init(c);
 }
 
 void mce_intel_feature_clear(struct cpuinfo_x86 *c)