[GIT PULL] ras/core for v6.8

From: Borislav Petkov
Date: Sun Jan 07 2024 - 07:57:15 EST


Hi Linus,

please pull a bunch of RAS changes for v6.8.

Thx.

---

The following changes since commit b85ea95d086471afb4ad062012a4d73cd328fa86:

Linux 6.7-rc1 (2023-11-12 16:19:07 -0800)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/ras_core_for_v6.8

for you to fetch changes up to 1f68ce2a027250aeeb1756391110cdc4dc97c797:

x86/mce: Handle Intel threshold interrupt storms (2023-12-15 14:53:42 +0100)

----------------------------------------------------------------
- Convert the hw error storm handling into a finer-grained, per-bank
solution which allows for more timely detection and reporting of
errors

- Start a documentation section which will describe the relevant RAS
features and how they should be used

- Add new AMD error bank types

- Slim down the kernel side of error decoding by removing the error type
descriptions; rasdaemon can be used from now on to decode hw errors
on AMD

- Mark pages containing uncorrectable errors as poison so that kdump can
avoid them and thus not cause another panic

- The usual cleanups and fixlets

----------------------------------------------------------------
Borislav Petkov (AMD) (1):
Documentation: Begin a RAS section

Muralidhara M K (2):
EDAC/mce_amd: Remove SMCA Extended Error code descriptions
x86/MCE/AMD: Add new MA_LLC, USR_DP, and USR_CP bank types

Nikolay Borisov (1):
x86/mce: Remove redundant check from mce_device_create()

Tony Luck (3):
x86/mce: Remove old CMCI storm mitigation code
x86/mce: Add per-bank CMCI storm mitigation
x86/mce: Handle Intel threshold interrupt storms

Yazen Ghannam (2):
x86/mce/inject: Clear test status value
x86/mce/amd, EDAC/mce_amd: Move long names to decoder module

Zhiquan Li (1):
x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel

Documentation/RAS/ras.rst           |  26 ++
Documentation/index.rst             |   1 +
arch/x86/include/asm/mce.h          |   4 +-
arch/x86/kernel/cpu/mce/amd.c       |  80 +++---
arch/x86/kernel/cpu/mce/core.c      |  72 +++--
arch/x86/kernel/cpu/mce/inject.c    |   1 +
arch/x86/kernel/cpu/mce/intel.c     | 304 +++++++++------------
arch/x86/kernel/cpu/mce/internal.h  |  66 ++++-
arch/x86/kernel/cpu/mce/threshold.c | 115 ++++++++
drivers/edac/mce_amd.c              | 526 +++---------------------------------
10 files changed, 457 insertions(+), 738 deletions(-)
create mode 100644 Documentation/RAS/ras.rst


--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette