Re: [NAK] Re: [PATCH -v2 9/9] ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support

From: Ingo Molnar
Date: Mon Oct 25 2010 - 05:26:30 EST



* Huang Ying <ying.huang@xxxxxxxxx> wrote:

> > Sigh, please integrate all this into EDAC (drivers/edac/) properly, instead of
> > turning it into YET ANOTHER hardware vendor special hw-errors thing. We can do
> > better than this. EDAC is almost there: it has support for Nehalem, AMD, a
> > couple of older chips.
>
> I think APEI (ACPI Platform Error Interface) is another driver. Why integrate two
> drivers?

Sigh. I did not say integrate the drivers - integrate the _error event facilities_.

You can have drivers/edac/apei/ghes* bits just fine (in fact it would be desirable,
to keep things modular).

Really, just read the two Kconfig entries:

    bool "EDAC (Error Detection And Correction) reporting"

        EDAC is designed to report errors in the core system.
        These are low-level errors that are reported in the CPU or
        supporting chipset or other subsystems:
        memory errors, cache errors, PCI errors, thermal throttling, etc..

    ...

    tristate "APEI Generic Hardware Error Source"

        Generic Hardware Error Source provides a way to report
        platform hardware errors (such as that from chipset).

drivers/acpi/apei/ overlaps and duplicates drivers/edac/. We don't want two
facilities, two ABIs, two sets of behavior. erst-dbg even defines a /dev node with
two ioctls, and a debugfs file to read/write records ...

I have NAK-ed various attempts to extend /dev/mcelog and asked for it to be done
properly, and work has begun on that - but the debugfs interface here just tries to
work around those objections by stealth.

I'd like you and Andi to listen not just to the letter of NAKs but to the spirit as
well. If you get a NAK in one subsystem you should not just try to route around the
NAK, go to some other subsystem, figure out a slightly different scheme and try to
sneak crap upstream ...

If you disagree with the mcelog NAK, if you disagree with EDAC directions then at
least do it openly and honestly and Cc: the parties that sent you the NAK and work
with the EDAC guys to migrate to the facility you are advancing ...

Thanks,

Ingo