Re: Software based ECC ?

From: Jan Engelhardt
Date: Sun Aug 12 2007 - 13:07:32 EST



On Aug 12 2007 18:51, Folkert van Heusden wrote:
>
>> > http://pdos.csail.mit.edu/papers/softecc:ddopson-meng/softecc_ddopson-meng.pdf
>> > "SoftECC : A System for Software Memory Integrity Checking"
>>
>> Personally, I'd recommend just shelling out the bucks for hardware ECC if
>> the reliability matters.
>
>a question and an idea: Q: is ecc guaranteed to detect all bitflips?
>
>Idea: what about a multicore system (3 or more) that runs the same
>processes on 2 cores and a third core verifying that they both do the
>same? As I think it is not only ram that can become faulty.

Indeed. And for example BOINC (Seti@home) have to consider this. Hence they
recalculate each work unit at least three times and then compare between
each. What makes this different from ECC is that the checksum is not calculated
on every memory operations, but at the end of a larger block of operations. Of
course this may mean that an error can propagate for a while, but the total
walltime (including recomputation) is lower. :)


Jan
--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/