Re: Linux & ECC memory

Flynn, Jason, FLYNNJS (flynnjs@btlip15.bt.co.uk)
Fri, 15 Nov 96 08:39:00 UCT


Eric wrote:
>It is much more interesting to me to have the kernel tell me that a
>correction was made (soft error) so that I have the opportunity to replace
>it before it degrades and a hard error occurs.

Is this likely? My understanding of a soft error is that your memory got hit
by some radiation or was affected by a power glitch. Such errors are random
events that have nothing to do with a dodgy bit, so they are unlikely to
happen again at the same location.

If a bit is dodgy in itself and is on the blink, then this should be classed
as a hard fault from the outset.

Hard to tell the difference, I know, but just my pennyworth...
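
One rough way to tell the two apart in software is to count how often a
correction lands on the same address: a one-off is probably a soft error,
while repeats at one address suggest a failing bit. The sketch below assumes
the kernel (or anything else) can hand you a list of corrected addresses. The
threshold, the function name and the sample log are all made up for
illustration and do not correspond to any real kernel interface.

```python
from collections import Counter

# Hypothetical threshold: corrections at one address before we stop
# treating it as a random (soft) event. The value is an assumption.
HARD_THRESHOLD = 3


def classify(corrected_addresses, threshold=HARD_THRESHOLD):
    """Map each address to 'soft' or 'suspected hard' by repeat count."""
    counts = Counter(corrected_addresses)
    return {
        addr: "suspected hard" if n >= threshold else "soft"
        for addr, n in counts.items()
    }


# Made-up log of addresses where ECC corrected a single-bit error.
log = [0x1A2B40, 0x00FF10, 0x1A2B40, 0x7C0000, 0x1A2B40]
result = classify(log)
for addr, verdict in sorted(result.items()):
    print(f"{addr:#08x}: {verdict}")
```

With this log, only the address that shows up three times is flagged. A real
scheme would probably also want a time window, since a handful of hits spread
over years is not the same as a handful in an hour.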

(Just as an aside, what about these smart memory modules?
They detect a word that is always failing its check, then re-map it to some
spare words. This is totally transparent, and these things should go in a MB
without any chipset wizardry...)
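
The re-mapping idea can be pictured with a toy model like the one below. It
is only a guess at the general principle, not a description of how any actual
module works: the class, its method names and the failure limit are all
hypothetical, and the corrected data is assumed to be supplied by the ECC
logic.

```python
class RemappingModule:
    """Toy model of a memory module with spare words (names hypothetical)."""

    def __init__(self, size, spares, fail_limit=2):
        self.cells = [0] * (size + spares)  # main words followed by spares
        self.free_spares = list(range(size, size + spares))
        self.remap = {}   # logical address -> index of the spare word in use
        self.fails = {}   # logical address -> number of failed checks seen
        self.fail_limit = fail_limit  # assumed limit before re-mapping

    def _physical(self, addr):
        # The host only ever sees logical addresses; the redirect is internal.
        return self.remap.get(addr, addr)

    def write(self, addr, value):
        self.cells[self._physical(addr)] = value

    def read(self, addr):
        return self.cells[self._physical(addr)]

    def report_failed_check(self, addr, corrected_value):
        """Record a failed check; move the word to a spare once it repeats."""
        self.fails[addr] = self.fails.get(addr, 0) + 1
        worn_out = self.fails[addr] >= self.fail_limit
        if worn_out and addr not in self.remap and self.free_spares:
            self.remap[addr] = self.free_spares.pop(0)
        # Store the corrected data wherever the word now lives.
        self.write(addr, corrected_value)


mod = RemappingModule(size=8, spares=2)
mod.write(3, 0xBEEF)
mod.report_failed_check(3, 0xBEEF)  # first failure: counted, no re-map yet
mod.report_failed_check(3, 0xBEEF)  # second failure: word moves to a spare
print(mod.remap, hex(mod.read(3)))
```

The host keeps reading and writing address 3 throughout, which is the sense
in which the re-mapping is transparent.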

J