Re: [PATCH 02/11] x86/nmi, EDAC: Get rid of DRAM error reporting thru PCI SERR NMI

From: Thomas Gleixner
Date: Mon Apr 10 2017 - 09:40:16 EST


On Thu, 6 Apr 2017, Borislav Petkov wrote:
> From: Borislav Petkov <bp@xxxxxxx>
>
> Apparently, some machines used to report DRAM errors through a PCI SERR
> NMI. This is why we have a call into EDAC in the NMI handler. See
>
> c0d121720220 ("drivers/edac: add new nmi rescan").
>
> >From looking at the patch above, that's two drivers: e752x_edac.c and

Stray '>'

> e7xxx_edac.c. Now, I wanna say those are old machines which are probably
> decommissioned already.
>
> Tony says that "[t]the newest CPU supported by either of those drivers
> is the Xeon E7520 (a.k.a. "Nehalem") released in Q1'2010. Possibly some
> folks are still using these ... but people that hold onto h/w for 7
> years generally cling to old s/w too ... so I'd guess it unlikely that
> we will get complaints for breaking these in upstream."
>
> So even if there is a small number still in use, we did load EDAC with
> edac_op_state == EDAC_OPSTATE_POLL by default (we still do, in fact)
> which means a default EDAC setup without any parameters supplied on the
> command line or otherwise would never even log the error in the NMI
> handler because we're polling by default:
>
> inline int edac_handler_set(void)
> {
> if (edac_op_state == EDAC_OPSTATE_POLL)
> return 0;
>
> return atomic_read(&edac_handlers);
> }
>
> So, long story short, I'd like to get rid of that nastiness called
> edac_stub.c and confine all the EDAC drivers solely to drivers/edac/. If
> we ever have to do stuff like that again, it should be notifiers we're

Notifiers? You mean a proper NMI handler, right?

Other than that: Acked-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

Thanks,

tglx