Re: [NAK] Re: [PATCH -v2 9/9] ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support

From: Borislav Petkov
Date: Mon Oct 25 2010 - 17:51:44 EST


On Mon, Oct 25, 2010 at 02:23:12PM -0700, Tony Luck wrote:
> On Mon, Oct 25, 2010 at 1:23 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> > You may be right but what we actually want is a consistent RAS
> > infrastructure. Didn't you point out at the last edac meeting in Boston
> > that concerning RAS Linux was in the stone ages? (at least this is what
> > I remember reading).
>
> That meeting was in San Francisco - but your recollection is correct.
> Right now we have ways to count errors, and to attribute them to
> specific hardware components (if we are lucky). This is only the
> beginning of the feature set that is needed to be "advanced RAS".

Yep.

> > What we should do is put all that post-system-reset error info, ECC
> > errors mapping to DRAM devices, L3 cache index manipulation based on
> > excessive errors - you name it - together and stick it in ras/ or
> > drivers/ras or whatever. And all with a nice and easy to use userspace
> > tool on top.
>
> This is what we should be working towards. I don't think we have
> a clear picture of what that high level infrastructure looks like. It
> needs to be very flexible to take input from all sorts of platform
> specific "driver" code that collects data. The "perf events"
> mechanism looks plausible as a transport mechanism for
> reporting corrected (or otherwise non-fatal) events.

Also agreed.

> But the errors that didn't kill the system are only part of the RAS
> picture.

Concerning fatal errors, take a look at drivers/edac/mce_amd.(c|h)¹ -
this is not in arch/x86/ and still decodes MCEs in the kernel. And it
works fine - it even helped in several cases where people simply read
their serial console/dmesg and didn't have to collect it first and run
it through some tool to understand which functional unit in the CPU is
mchecking.

And I have an error injection module which can inject MCEs using only
two sysfs files. It is software-only injection for now but still, you
can even inject from the shell.

So I think having RAS decoupled from x86 proper could work but it has to
be done smart. In the example above, I've hooked into the machine check
notifier chain. And yes, this is just an example but you get the idea
and where we want to go.
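
To make the notifier idea concrete, here is a rough userspace model in
Python. It is only a sketch of the pattern, not the mce_amd code: the
class, the return constants, the event fields and the bank-to-unit table
are all made up for illustration.

```python
# Userspace model of a notifier chain: decoders register a callback and
# get called, highest priority first, for every reported event.
# All names here are hypothetical; this is not the kernel API.

NOTIFY_DONE = 0   # handler finished, keep walking the chain
NOTIFY_STOP = 1   # handler consumed the event, stop walking


class NotifierChain:
    def __init__(self):
        self._handlers = []  # list of (priority, callback)

    def register(self, callback, priority=0):
        self._handlers.append((priority, callback))
        # Stable sort: equal priorities keep registration order.
        self._handlers.sort(key=lambda h: -h[0])

    def call(self, event):
        for _prio, callback in self._handlers:
            if callback(event) == NOTIFY_STOP:
                return NOTIFY_STOP
        return NOTIFY_DONE


# Illustrative mapping only, not a real decoding table.
EXAMPLE_UNITS = {0: "data cache", 4: "northbridge"}


def make_decoder(log):
    """Return a handler that appends a human-readable line to `log`."""
    def decode(event):
        unit = EXAMPLE_UNITS.get(event["bank"], "unknown unit")
        log.append("MCE in bank %d (%s)" % (event["bank"], unit))
        return NOTIFY_DONE
    return decode
```

The point of the pattern is that the arch code only has to walk the
chain; whoever wants to decode or log registers from outside of it.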

> > Now it looks like a wart on arch/x86/ which truly doesn't belong there.
> > And I don't buy all that crap that it can't be done right.
>
> Of course it is a wart ... look up ACPI in any dictionary and you'll
> find a picture of a stereotypical Halloween witch :-)

LOL. Can I have the ACPI witch on a t-shirt please?

> I don't see other architectures lining up to support ACPI ... but
> we shouldn't just ignore it in x86. The APEI pieces that were added
> to ACPI 4.0 have some interesting and useful features.

Yeah, I like the error info after system reset thing using nvram.
This might make a lot of sense with certain critical errors where we
syncflood the HT links to prevent corrupted data propagation.

> Most of them are already implemented on shipping platforms because
> the APEI bits were simply documenting WHEA (Windows Hardware Error
> Architecture) features. Look for this stuff in dmesg:
>
> ACPI: HEST 000000007fb1c000 000A8 (v01 INTEL SFC4UR 00000001 INTL 00000001)
> ACPI: BERT 000000007fb1b000 00030 (v01 INTEL SFC4UR 00000001 INTL 00000001)
> ACPI: ERST 000000007fb1a000 00230 (v01 INTEL SFC4UR 00000001 INTL 00000001)
> ACPI: EINJ 000000007fb19000 00130 (v01 INTEL SFC4UR 00000001 INTL 00000001)

Yeah, I'm not saying we shouldn't have those - I'm just saying that we
should think up a good infrastructure design first and hook them into it
so that users can have the highest benefit.
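
For checking which of the APEI tables a box advertises, something like
the following Python sketch would do. It assumes the dmesg line layout
quoted above (table name, address, then a second hex field taken here to
be the table length); the function and variable names are made up.

```python
import re

# The four APEI tables mentioned in the quoted dmesg output.
APEI_TABLES = ("HEST", "BERT", "ERST", "EINJ")

# Matches e.g. "ACPI: HEST 000000007fb1c000 000A8 (v01 ...)".
# search() is used so an optional "[ timestamp]" prefix does not matter.
_ACPI_LINE = re.compile(r"ACPI: (\w{4}) ([0-9a-fA-F]+) ([0-9a-fA-F]+)")


def find_apei_tables(dmesg_text):
    """Return {table_name: (address, length)} for APEI tables found."""
    found = {}
    for line in dmesg_text.splitlines():
        match = _ACPI_LINE.search(line)
        if match and match.group(1) in APEI_TABLES:
            address = int(match.group(2), 16)
            length = int(match.group(3), 16)
            found[match.group(1)] = (address, length)
    return found
```

Feed it the text of the kernel log, for example the output of running
dmesg captured via subprocess, and any table missing from the result is
one the firmware does not provide.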

And then have a single tool which talks to the high level interface and
all different arches implement whatever makes sense for them. Basically
what you said above about the flexible high level infrastructure.

Thanks.

¹in current git; in 2.6.36 the files were called edac_mce_amd.(c|h).

--
Regards/Gruss,
Boris.