Re: [HW PROBLEM] Intel I7 MCE. Erratum or not?

From: Giangiacomo Mariotti
Date: Sat Dec 06 2008 - 17:16:42 EST

On Sat, Dec 6, 2008 at 10:47 PM, Robert Hancock <hancockr@xxxxxxx> wrote:
> Giangiacomo Mariotti wrote:
>> On Sat, Dec 6, 2008 at 9:58 PM, Robert Hancock <hancockr@xxxxxxx> wrote:
>>> Giangiacomo Mariotti wrote:
>>>> Hi everyone,
>>>> Mcelog just logged on my new Intel I7 920 (on Linux this :
>>>> MCE 0
>>>> HARDWARE ERROR. This is *NOT* a software problem!
>>>> Please contact your hardware vendor
>>>> CPU 0 BANK 6 MISC 202d ADDR ffeef740
>>>> MCG status:
>>>> MCi status:
>>>> Error overflow
>>>> Uncorrected error
>>>> MCi_MISC register valid
>>>> MCi_ADDR register valid
>>>> Processor context corrupt
>>>> MCA: Generic CACHE Level-2 Data-Write Error
>>>> STATUS ee0000000100014a MCGSTATUS 0
>>>> I'm reporting this here, because I found in the Intel I7 Technical
>>>> Specification November 2008 update that something which seems very
>>>> similar is in fact an erratum. So my question is : Is there any way
>>>> for me to verify that my problem is due to one of those errata,instead
>>>> of a broken hardware(if we don't want to consider all those errata as
>>>> broken hardware)? I'm also reporting this because I thought it may be
>>>> useful to signal that(if actually due to those errata) these problems
>>>> actually occur, so it may be useful to find workarounds in the kernel
>>>> to not scare to death poor Linux users!
>>> Which erratum are you talking about? I don't see one in that document
>>> that
>>> would match this case..
>> Well, the first one seems very similar, even if it talks about a dtlb
>> error instead of cache error. But sure,being similar doesn't mean too
>> much. Number 52 seems similar too. I guess I should just give up and
>> admit that my hardware is broken!
> The first one is just indicating that if a DTLB error occurs the overflow
> bit may be set incorrectly. It's not a false error though. The AAJ52 erratum
> would only occur immediately after powerup or wake from sleep states.
The mce actually got logged once immediately after powerup and never
more. Is that reasonable? A cache error which happens just once after
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at