Re: [NAK] Re: [PATCH -v2 9/9] ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support

From: Borislav Petkov
Date: Mon Oct 25 2010 - 16:23:54 EST


On Mon, Oct 25, 2010 at 10:14:52AM -0700, Tony Luck wrote:
> On Mon, Oct 25, 2010 at 2:25 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> > drivers/acpi/apei/ overlaps and duplicates drivers/edac/. We don't want two
> > facilities, two ABIs, two sets of behavior. erst-dbg even defines a /dev node with
> > two ioctls, and a debugfs file to read/write records ...
>
> As mentioned above these 4-letter names come from the ACPI specification. ERST
> is perhaps the dumbest name of them all - "Error Record Serialization Table" is
> ACPI-speak for platform level non-volatile memory. This code simply provides
> a mechanism for Linux to stash some information in nvram before the system is
> reset, and to retrieve it after the reboot.
>
> The naming could be better - but I don't see any overlap with EDAC here.

You may be right, but what we actually want is a consistent RAS
infrastructure. Didn't you point out at the last EDAC meeting in Boston
that, concerning RAS, Linux was in the stone ages? (At least this is what
I remember reading.)

What we should do is put all that post-system-reset error info, ECC
errors mapping to DRAM devices, L3 cache index manipulation based on
excessive errors - you name it - together and stick it in ras/ or
drivers/ras or whatever. And all with a nice and easy-to-use userspace
tool on top.

Now it looks like a wart on arch/x86/ which truly doesn't belong there.
And I don't buy all that crap that it can't be done right.

Thanks.

--
Regards/Gruss,
Boris.