Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)

From: Tony Luck
Date: Wed Jun 29 2011 - 17:24:49 EST


One extra consideration for this whole proposal ...

Is the "physical address" a stable enough representation of the location
of the faulty memory cells?

On high-end systems I can see a number of ways in which the mapping
from cells to physical address may change across reboots:

1) The system supports redundant memory (rank sparing or mirroring)
2) BIOS self test removes some memory from use
3) A multi-node system elects a different node to be boot-meister,
which results in reshuffling of the address map.

If any of these can happen, then it doesn't matter whether we have
a list of addresses, or a pattern that expands to a list of addresses.
We'll still mark some innocent memory as bad, and allow some known
bad memory to be used - because our "addresses" no longer correspond
to the bad memory cells.
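
To make case 3 concrete, here is a toy sketch (not from the patch set, and
not how any real firmware builds its address map). It assumes a hypothetical
two-node machine where each node contributes NODE_SIZE bytes and the boot
order alone decides which node is mapped first. All names in it (cell,
boot_order, cell_to_phys, phys_to_cell) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine: two nodes, each contributing NODE_SIZE bytes. */
#define NODES 2
#define NODE_SIZE 0x1000u

/* A physical memory cell, identified by where it really lives. */
struct cell {
    unsigned node;
    uint32_t offset;
};

/*
 * boot_order[i] is the node whose memory occupies the i-th slot of the
 * physical address map for this boot (an assumption of this toy model).
 */
uint32_t cell_to_phys(const unsigned boot_order[NODES], struct cell c)
{
    assert(c.node < NODES && c.offset < NODE_SIZE);
    for (unsigned slot = 0; slot < NODES; slot++)
        if (boot_order[slot] == c.node)
            return slot * NODE_SIZE + c.offset;
    assert(!"node missing from boot order");
    return 0;
}

/* Inverse mapping: which cell does a physical address hit on this boot? */
struct cell phys_to_cell(const unsigned boot_order[NODES], uint32_t phys)
{
    assert(phys < NODES * NODE_SIZE);
    struct cell c = { boot_order[phys / NODE_SIZE], phys % NODE_SIZE };
    return c;
}
```

With boot order {0, 1}, a bad cell at (node 1, offset 0x40) gets recorded as
physical address 0x1040. If the next boot uses {1, 0}, address 0x1040 lands
on (node 0, offset 0x40), which is innocent memory, while the bad cell now
sits at 0x40 and is not excluded.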

-Tony