Re: [RCF] Linux memory error handling

From: Joel Schopp
Date: Wed Jun 15 2005 - 15:49:45 EST


A common question is whether single bit (corrected) errors will turn into double bit (uncorrected) errors. The answer is it
depends on the underlying cause of the memory error. There are
some errors that show up as single bits, especially transient and soft errors, that do not degrade over time. There are other
failures that do degrade over time.

This sounds like one of our primary motivations for working on memory hotplug remove. Detection of recoverable errors that degrade to unrecoverable errors, but don't because we remove the memory before it gets that far.

Much PPC64 hardware/firmware already supports this detection.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/