RE: Kernel Panic with Rawtherapee (mce related)

From: Luck, Tony
Date: Wed Mar 14 2012 - 13:52:49 EST


> You're getting a bunch of machine checks, the last one of them being
> fatal (Process Context Corrupt bit is set) causing the machine to panic.

PCC is set in all of them, not just the last one.

> Tony will probably be able to help you further in decoding what exactly
> those MC0_STATUS and MC5_STATUS values mean

The status for bank 5 ends in 0400 - which means "Internal timer error".
Bank 0 ends in 0800, which is a bus/interconnect error where this
processor was the source of a memory transaction.
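
For anyone who wants to check the low 16 bits by hand, here is a rough
sketch of the decoding. It assumes the architectural MCi_STATUS layout from
the Intel SDM (PCC in bit 57, MCA error code in bits 15:0, bus/interconnect
codes of the form 0000 1PPT RRRR IILL). The function names and the sample
status words are made up for illustration; they are not the values from the
original report, and only the two code families mentioned above are handled.

```python
# Sketch only: field positions assumed from the Intel SDM description of
# IA32_MCi_STATUS. Names below are hypothetical helpers, not a kernel API.
VAL_BIT, UC_BIT, PCC_BIT = 63, 61, 57

# PP field (bits 10:9) of a bus/interconnect error code.
PP = {
    0: "this processor originated the request",
    1: "this processor responded to the request",
    2: "this processor observed the error as a third party",
    3: "generic",
}
# II field (bits 3:2) of a bus/interconnect error code.
II = {0: "memory access", 1: "reserved", 2: "I/O", 3: "other transaction"}


def pcc_set(status):
    """True if the Processor Context Corrupt bit is set."""
    return bool((status >> PCC_BIT) & 1)


def decode_mca_code(status):
    """Describe the MCA error code held in bits 15:0 of a status word."""
    code = status & 0xFFFF
    if code == 0x0400:
        return "internal timer error"
    if (code & 0xF800) == 0x0800:  # 0000 1PPT RRRR IILL
        pp = (code >> 9) & 0x3
        ii = (code >> 2) & 0x3
        timeout = ", timeout" if code & 0x0100 else ""
        return "bus/interconnect error: %s, %s%s" % (PP[pp], II[ii], timeout)
    return "not handled by this sketch"


if __name__ == "__main__":
    # Hypothetical status words: VAL, UC and PCC set, plus the two codes.
    flags = (1 << VAL_BIT) | (1 << UC_BIT) | (1 << PCC_BIT)
    for bank, status in ((5, flags | 0x0400), (0, flags | 0x0800)):
        print("bank %d: PCC=%s, %s"
              % (bank, pcc_set(status), decode_mca_code(status)))
```

With PP = 00 and II = 00, the 0800 code reads as a request originated by
this processor for a memory access, which is the interpretation given above.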

That's where the facts end - speculation begins here ...

Since this is repeatable under load, it's possible that a page table got
corrupted and you are trying to access some non-existent memory location.
Do all traces for this panic involve *_tlb_* functions?

Or perhaps you have a cooling problem, and your CPU or memory is getting
too hot when stressed?

-Tony
