Re: [RFC PATCH 0/8] EDAC, mce_amd: Add a tracepoint for the decoded error

From: Ingo Molnar
Date: Thu Jul 27 2017 - 03:10:44 EST



* Borislav Petkov <bp@xxxxxxxxx> wrote:

> From: Borislav Petkov <bp@xxxxxxx>
>
> Hi,
>
> here's a first stab at adding a tracepoint which dumps the decoded MCE
> string to userspace. The main idea is to have the decoding functionality
> in the kernel and depending on whether you have userspace consumers
> listening or not, to dump the error to the tracepoint or to dmesg.
>
> In either case, we do the decoding in the kernel and don't need special
> userspace. Furthermore, adding new CPU support will have to be done only
> in one place.
>
> First 6 patches are cleanups which are good to have regardless, IMO.
>
> Any constructive comments and suggestions are appreciated.
>
> Thanks.
>
> P.S., Thanks to Rostedt for the input!
>
> Borislav Petkov (8):
>   EDAC, mce_amd: Rename decode_smca_errors() to decode_smca_error()
>   EDAC, mce_amd: Get rid of most struct cpuinfo_x86 uses
>   EDAC, mce_amd: Get rid of local var in amd_filter_mce()
>   seq_buf: Add seq_buf_clear_buf()
>   seq_buf: Export seq_buf_printf() to modules
>   EDAC, mce_amd: Convert to seq_buf
>   EDAC, mce_amd: Add a simple tracepoint dumping a decoded string
>   EDAC, mce_amd: Issue the decoded info through the TP or printk
>
>  drivers/edac/mce_amd.c  | 285 +++++++++++++++++++++++++++---------------------
>  drivers/ras/ras.c       |   1 +
>  include/linux/seq_buf.h |   7 ++
>  include/ras/ras_event.h |  16 +++
>  lib/seq_buf.c           |   1 +
>  5 files changed, 186 insertions(+), 124 deletions(-)

Looks pretty nice to me conceptually. Do you have a couple of examples of
real-life events that get logged? It's hard to tell what the decoded output
will look like from the new tracepoint definition alone.
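
For reference, the flow described in the cover letter (decode once in the
kernel, then emit to the tracepoint if a consumer is listening, otherwise to
dmesg) could be sketched roughly as below. This is a userspace mock, not code
from the patch set: tp_consumer_attached, decode_error(), report_error() and
the output format are all hypothetical names and assumptions made up for
illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for "a userspace consumer is attached to the TP". */
static bool tp_consumer_attached;

enum sink { SINK_TRACEPOINT, SINK_DMESG };

/* Mock sinks: the last string that went to each destination. */
static char last_tp[256];
static char last_log[256];

/* Decode once into buf, independent of where the string ends up.
 * The format here is invented; the real decoder is far richer. */
static int decode_error(char *buf, size_t len,
			unsigned int bank, unsigned long long status)
{
	return snprintf(buf, len, "bank %u: status 0x%016llx", bank, status);
}

/* Route the already-decoded string to exactly one sink. */
static enum sink report_error(unsigned int bank, unsigned long long status)
{
	char buf[256];

	decode_error(buf, sizeof(buf), bank, status);

	if (tp_consumer_attached) {
		/* Assumed behavior: the consumer gets the string, dmesg stays quiet. */
		snprintf(last_tp, sizeof(last_tp), "%s", buf);
		return SINK_TRACEPOINT;
	}

	/* Nobody listening: fall back to the log. */
	snprintf(last_log, sizeof(last_log), "%s", buf);
	fprintf(stderr, "[dmesg] %s\n", buf);
	return SINK_DMESG;
}
```

Calling report_error() with tp_consumer_attached cleared sends the string to
the mock log; setting the flag first sends the same decoded string to the
mock tracepoint instead. The point of the split is that decode_error() never
needs to know which sink is in use.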

Thanks,

Ingo