Re: [PATCH v2 1/2] x86/mce/AMD: Redo use of SMCA MCA_DE{STAT,ADDR} registers

From: Borislav Petkov
Date: Fri Apr 07 2017 - 17:35:24 EST


On Fri, Apr 07, 2017 at 08:37:03PM +0000, Ghannam, Yazen wrote:
> CEs will get picked up by polling or if we hit a threshold.

I think we should be pre-emptive here and simply log the error. No use
waiting until something polling finally gets its hands on it - we're
looking at the signature so we might just as well log it.

> Okay. But do we need a return value? I'm thinking we can go through all banks
> and log any and all Deferred errors rather than just the first one we find. I asked
> and this is the preferred method like how we do in the #MC handler. The same
> applies to the thresholding interrupt handler.

... and not only the deferred errors but the CEs too.

Which would make the whole code a *lot* simpler. You simply iterate over
banks:

for_each_bank()
log_error()

if (smca)
log_error_from_de_regs()

Purely pseudocode of course.

And there's no need to go and look whether the error is a deferred
error or whatnot. In the majority of the cases it will be because we're
in the #DF handler and it better be raised for a #DF.

But even if we see something else, we should simply log it. As long as
it is a valid error signature there's nothing wrong with us logging it
and clearing the regs. The earlier we do so, the lower the probability
for setting the overflow bit.

Yeah, that should be the simplest and most robust way without missing
out on any errors. Hmm, I like it.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.