Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

From: Mauro Carvalho Chehab
Date: Mon Jul 24 2017 - 14:11:28 EST


On Mon, 24 Jul 2017 18:44:00 +0200
Borislav Petkov <bp@xxxxxxxxx> wrote:

> On Mon, Jul 24, 2017 at 01:04:13PM -0300, Mauro Carvalho Chehab wrote:
> > If the Kernel forces those users to use ghes_edac by default,
> > they won't see the error counts anymore but, instead,
> > hardware reports that the memories need to be replaced.
>
> This is exactly why I'm trying to load ghes_edac only on those platforms
> which would really want it.
>
> > So, the right solution would be to keep hardware first, but
> > providing a modprobe parameter to let them switch to software
> > first.
>
> That's exactly the issue: if we make it spec-conform and adhere to FF
> setting, then it'll be clean. BUT(!), we will force ghes_edac on those
> platforms which potentially are using the platform-specific drivers
> until now. Not good.
>
> If we do the whitelisting, then we're stuck with maintaining a yucky
> whitelist and have to keep updating ghes_edac with it.

Yeah, having a whitelist is a maintenance burden, but, on
the other hand, I suspect that there aren't many systems that
implement FF, have a reliable BIOS mapping of the MB's silkscreen
and don't filter out corrected errors using some sort of
undocumented mechanism.

So, I guess it is doable.

Another alternative, which IMO is better, would be to add a parameter like:

edac=FF - firmware first;
edac=hw - hardware first;
edac=auto - honors FF if set in BIOS. Otherwise, hardware first.

In order to avoid regressions, and to avoid the need for a whitelist,
I would keep "edac=hw" as the default.
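
Just to illustrate the semantics I have in mind, here is a rough userspace
sketch, not a patch. Every name in it (edac_parse_mode, edac_pick_source,
the enums) is hypothetical and does not exist in the tree, and the
"bios_ff" flag stands in for whatever the firmware actually reports.
Falling back to "hw" on an unrecognized value is also just an assumption
on my side:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; nothing here exists in the kernel tree. */
enum edac_mode   { EDAC_MODE_HW, EDAC_MODE_FF, EDAC_MODE_AUTO };
enum edac_source { EDAC_SRC_HW, EDAC_SRC_FF };

/*
 * Parse the value of the proposed "edac=" parameter.
 * A missing or unrecognized value falls back to "hw", the proposed
 * default (the fallback for unknown strings is an assumption).
 */
enum edac_mode edac_parse_mode(const char *val)
{
	if (val && !strcmp(val, "FF"))
		return EDAC_MODE_FF;
	if (val && !strcmp(val, "auto"))
		return EDAC_MODE_AUTO;
	return EDAC_MODE_HW;
}

/*
 * Decide which error source to use.
 * bios_ff: true if the BIOS advertises firmware-first handling.
 */
enum edac_source edac_pick_source(enum edac_mode mode, bool bios_ff)
{
	switch (mode) {
	case EDAC_MODE_FF:
		return EDAC_SRC_FF;	/* always firmware first */
	case EDAC_MODE_AUTO:
		/* honor FF only if the BIOS sets it */
		return bios_ff ? EDAC_SRC_FF : EDAC_SRC_HW;
	case EDAC_MODE_HW:
	default:
		return EDAC_SRC_HW;	/* hardware first, the default */
	}
}
```

With this, a system booted without any edac= option keeps today's
behavior even if the BIOS sets FF, and only "edac=auto" or "edac=FF"
would change which source is used.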

Thanks,
Mauro