Re: Linux & ECC memory

Richard B. Johnson (root@analogic.com)
Thu, 14 Nov 1996 21:34:33 -0500 (EST)


On Thu, 14 Nov 1996, Kenneth Albanowski wrote:

>
> This is what I'm curious about. Does Linux's NMI code attempt to work
> around some memory problems, or does it just panic? Also, can the glue
> report successful error-correction, as well as failed error-correction?
> (Or is it not useful to know if your memory has/had an error that was
> correctable?)
>
There would have to be someplace (not in available RAM) to store statistics
about corrections that occur. Old VAXen do that. The memory controller
chips have "ports" that can be read by the OS to obtain statistics. VAX/VMS
will write an error log message about what was done and what "page" of
memory failed or had to be corrected. If the pages were in kernel space,
the system crashes after writing a crash-dump file. If the page was not
yet written (used), it is mapped out and the system continues. Pages on
old VAXen are only 512 bytes. Such systems will run happily with only
2 megabytes of RAM with about 1/2 of it bad as long as the bad RAM wasn't
in use by the kernel.
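
The VAX/VMS behaviour described above can be sketched roughly as a small
decision function. This is only a toy model of the policy as I described
it, not actual VMS code: the names (Owner, Action, handle_memory_error)
are made up for illustration, and two points are assumptions on my part:
that a successfully corrected error is merely logged, and that an
uncorrected error in a page already in use outside the kernel is left
unspecified because it is not covered above.

```python
from enum import Enum

PAGE_SIZE = 512                        # bytes per page on old VAXen
RAM_BYTES = 2 * 1024 * 1024            # the 2-megabyte example
TOTAL_PAGES = RAM_BYTES // PAGE_SIZE   # 4096 pages of 512 bytes each


class Owner(Enum):
    """Who holds the page when the error is reported (hypothetical names)."""
    UNUSED = "unused"
    KERNEL = "kernel"
    USER = "user"


class Action(Enum):
    """What the OS does after logging the error (hypothetical names)."""
    LOG_ONLY = "log only"
    MAP_OUT = "map out page, continue"
    CRASH_DUMP = "write crash dump, halt"
    UNSPECIFIED = "not covered by the description"


def handle_memory_error(page, owner, corrected, error_log, bad_pages):
    """Log the event, then choose an action following the described policy."""
    # Every event gets an error-log entry naming the page and the outcome.
    outcome = "corrected" if corrected else "failed"
    error_log.append((page, owner.value, outcome))

    if corrected:
        # Assumption: a successful correction needs no further action.
        return Action.LOG_ONLY
    if owner is Owner.KERNEL:
        # Uncorrectable error in kernel space: crash after the dump.
        return Action.CRASH_DUMP
    if owner is Owner.UNUSED:
        # Page not yet written: retire it and keep running.
        bad_pages.add(page)
        return Action.MAP_OUT
    # Assumption: pages in use outside the kernel are not described above.
    return Action.UNSPECIFIED
```

With these numbers, a 2-megabyte machine has 4096 pages, so "about 1/2 of
it bad" means roughly 2048 pages mapped out and about 1 megabyte still
usable, provided none of the bad pages belonged to the kernel.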

To my knowledge, there are not any "modern" computers that handle memory
errors this well. An uncorrected memory error on SGIs, Suns, and even Crays
is fatal. However, on both the SGIs and Crays, ECC corrects most memory
errors so the fact that they occurred at all is not known to anyone. No
statistics are kept. I think some of the newer, faster Suns have ECC. If
so, they should respond the same way.

Since RAM access happens so quickly, keeping statistics on corrections
that have occurred could seriously impact performance. The performance of
VAXen was not seriously affected because they were soooooo sloooow to
begin with.

Cheers,
Dick Johnson
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Richard B. Johnson
Project Engineer
Analogic Corporation
Voice : (508) 977-3000 ext. 3754
Fax : (508) 532-6097
Modem : (508) 977-6870
Ftp : ftp@boneserver.analogic.com
Email : rjohnson@analogic.com, johnson@analogic.com
Penguin : Linux version 2.1.9 on an i586 machine.
Warning : It's hard to remain at the trailing edge of technology.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-