Re: Aerospace and linux

From: Henrique de Moraes Holschuh
Date: Fri Jun 11 2010 - 10:37:51 EST


On Thu, 10 Jun 2010, Chris Friesen wrote:
> On 06/10/2010 11:29 AM, Brian Gordon wrote:
> > When these SEU can be detected some action may be taken to improve
> > the behaviour of the system (log a fault and reset in order to
> > refresh things from scratch?). So the first question becomes how to
> > detect an SEU.
>
> I do work in telco stuff. We use ECC RAM, turn on ECC/parity on the
> various buses, enable error-checking in the hardware, etc.

Let's not forget that the hardware had better have unassisted scrubbing
(rewriting a cell when a CE, i.e. a correctable error, is detected in
it), because we don't scrub when we are notified of a CE.
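
As a rough illustration of what "being notified of a CE" looks like from
userspace, the sketch below reads the per-memory-controller error counters
that the Linux EDAC subsystem exposes in sysfs. The path layout
(/sys/devices/system/edac/mc/mc*/ce_count and ue_count) follows the kernel's
EDAC documentation; the function name and the `base` parameter are made up
for this example, and the directory only exists when an EDAC driver is
loaded for the memory controller.

```python
import glob
import os


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": int|None, "ue": int|None}} for each mc* dir.

    `base` is a parameter only so the function can be pointed at a test
    directory; on a real system the default is the EDAC sysfs root.
    """
    counts = {}
    for mc_dir in sorted(glob.glob(os.path.join(base, "mc[0-9]*"))):
        entry = {}
        for key, fname in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                with open(os.path.join(mc_dir, fname)) as f:
                    entry[key] = int(f.read().strip())
            except (OSError, ValueError):
                # Counter missing or unreadable: report it as unknown.
                entry[key] = None
        counts[os.path.basename(mc_dir)] = entry
    return counts


if __name__ == "__main__":
    for mc, c in read_edac_counts().items():
        print(mc, "CE:", c["ce"], "UE:", c["ue"])
```

A rising CE count only says that an error was seen and corrected on the
read path; by itself it does not tell you the bad cell was rewritten.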

Background scrubbing might also be something to look for (walking all
of RAM over a long period of time, to catch dormant CEs and fix them
before they become UEs, i.e. uncorrectable errors).
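
To make the "fix them before they become UEs" point concrete, here is a toy
model of scrubbing. It assumes a SECDED code (single error correct, double
error detect) built as an extended Hamming (8,4) code over 4-bit values.
Real ECC DIMMs use much wider codes, and every name below is hypothetical;
this is only a sketch of the principle, not of any actual memory controller.

```python
def encode(nibble):
    """Encode a 4-bit value as an 8-bit extended Hamming (SECDED) word."""
    w = [0] * 8
    # Data bits live at positions 3, 5, 6, 7; parity at 1, 2, 4; overall at 0.
    w[3], w[5], w[6], w[7] = [(nibble >> i) & 1 for i in range(4)]
    w[1] = w[3] ^ w[5] ^ w[7]
    w[2] = w[3] ^ w[6] ^ w[7]
    w[4] = w[5] ^ w[6] ^ w[7]
    w[0] = sum(w[1:]) & 1  # makes the parity of the whole word even
    return w


def decode(w):
    """Return (status, word): status is 'ok', 'CE' (corrected) or 'UE'."""
    syndrome = 0
    for pos in range(1, 8):
        if w[pos]:
            syndrome ^= pos
    overall_odd = sum(w) & 1
    if syndrome == 0 and not overall_odd:
        return "ok", list(w)
    if overall_odd:
        # Odd overall parity: one flipped bit, located by the syndrome
        # (syndrome 0 means the overall parity bit itself flipped).
        fixed = list(w)
        fixed[syndrome] ^= 1
        return "CE", fixed
    # Non-zero syndrome with even overall parity: two bits flipped.
    return "UE", list(w)


def data(w):
    """Extract the 4-bit payload from a code word."""
    return w[3] | (w[5] << 1) | (w[6] << 2) | (w[7] << 3)


def scrub(memory):
    """Walk every word, write back corrected CEs, return (ce, ue) counts."""
    ce = ue = 0
    for i, w in enumerate(memory):
        status, fixed = decode(w)
        if status == "CE":
            memory[i] = fixed  # the write-back is the actual scrub
            ce += 1
        elif status == "UE":
            ue += 1
    return ce, ue


if __name__ == "__main__":
    scrubbed = [encode(0xA)]
    unscrubbed = [encode(0xA)]
    for mem in (scrubbed, unscrubbed):
        mem[0][5] ^= 1           # first upset: a dormant CE
    scrub(scrubbed)              # only one copy gets a background scrub
    for mem in (scrubbed, unscrubbed):
        mem[0][6] ^= 1           # second upset in the same word
    print("scrubbed:  ", decode(scrubbed[0])[0])
    print("unscrubbed:", decode(unscrubbed[0])[0])
```

The word that was scrubbed between the two upsets sees the second flip as
just another single-bit error, while the unscrubbed word has accumulated
two flipped bits, which this code can detect but not correct.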

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh