RE: Hardware Error Kernel Mini-Summit

From: Luck, Tony
Date: Tue Jun 15 2010 - 14:15:38 EST


> Could we come up with some plan that doesn't involve
> trusting to the goodwill (and competence) of BIOS writers?

That would be nice - but there already exists a platform
(Xeon-7500 series a.k.a. Nehalem-EX) where the hardware
chipset registers that you would need to do your own
memory topology reverse engineering in Linux are only
accessible to SMM-level code. I've finally come to the
conclusion that an EDAC-style driver just isn't possible
for this set of systems.

>I personally really like the device tree compiler for PowerPC.
>It allows you to be explicit about what you have. Not for everyone,
>but maybe there could be some way to apply the same principle? Maybe
>some way of loading modules with parameters or configuring your setup
>from sysfs?

Even when the chipset registers are accessible, it can be very
complex to do this for the general case (think of boards that
support arbitrary mixing of different size/speed DIMMs - the
BIOS may have done some interesting somersaults while computing
which interleaving modes to use).
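
To make the interleaving point concrete, here is a toy sketch of
the decode an OS driver would have to reproduce. Everything in it
is an assumption for illustration: two channels with a 4 GiB and a
2 GiB DIMM, 64-byte interleave granularity, and a BIOS that
interleaves the first 4 GiB and maps the leftover half of the
larger DIMM linearly above that. The names (decode, LINE,
INTERLEAVED_END) are hypothetical and describe no real chipset.

```python
GIB = 1 << 30
LINE = 64  # assumed interleave granularity in bytes

# Hypothetical layout: channel 0 holds 4 GiB, channel 1 holds 2 GiB.
INTERLEAVED_END = 4 * GIB  # first 4 GiB alternate between both channels
TOTAL = 6 * GIB            # total populated memory


def decode(addr):
    """Map a physical address to (channel, offset within that channel)."""
    if not 0 <= addr < TOTAL:
        raise ValueError("address outside populated memory")
    if addr < INTERLEAVED_END:
        line = addr // LINE
        channel = line % 2  # consecutive lines alternate channels
        offset = (line // 2) * LINE + addr % LINE
        return channel, offset
    # Remainder of the larger DIMM, mapped linearly after the half
    # of it that was consumed by the interleaved region.
    return 0, (addr - INTERLEAVED_END) + INTERLEAVED_END // 2
```

In this model decode(64) lands on channel 1 at offset 0, while
decode(4 * GIB) lands on channel 0 at offset 2 GiB. Even this
two-DIMM case needs a region table plus per-region arithmetic, and
a real board adds ranks, more channels, and whatever modes the
BIOS happened to pick.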

It gets even more complex on high-end systems, where the BIOS
may handle row sparing transparently to the OS. Memory mirroring
is also becoming fashionable - how can EDAC represent this (when
the h/w view of the memory doesn't match the OS view)?

-Tony

