Re: EDAC k8 MC1: unknown syndrome

From: Andi Kleen
Date: Thu Aug 09 2007 - 11:07:33 EST


Phil Moors <pmoors@xxxxxxxxxxxxx> writes:
>
> But is the error really reporting a DIMM problem or a race condition in
> the kernel code handling ECC?

First if you don't use a mainline kernel you should at least mention
it in the report.

It could well be a race -- the k8 DAC code is always racing against
the builtin machine check handler which reads the same registers.
That is why it was never accepted for mainline and shouldn't be used.

However these races should normally only cause lost events, not
bogus data, so it's probably something the CPU really reported
and EDAC doesn't understand.

The builtin machine check code will report all the same information
through mcelog (on x86-64) or syslog (on i386)

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/