Re: [PATCH -v2] x86, MCE: Drop default decoding notifier

From: Prarit Bhargava
Date: Wed Apr 13 2011 - 13:14:54 EST




On 04/13/2011 01:01 PM, Prarit Bhargava wrote:
>
>> @@ -239,7 +227,9 @@ static void print_mce(struct mce *m)
>>  	 * Print out human-readable details about the MCE error,
>>  	 * (if the CPU has an implementation for that)
>>  	 */
>> -	atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
>> +	ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
>> +	if (ret != NOTIFY_STOP)
>> +		pr_emerg(HW_ERR "Run the above through 'mcelog --ascii' to decode.\n");
>>  }
>>
>>
> Borislav,
>
>

Oops. Let me *carefully* rephrase that so it is clear what I'm
complaining about.

> I still think you need the check for UC here. When an UC occurs and
> mce_panic() is called the output will include:
>
> [Hardware Error]: Run the above through 'mcelog --ascii' to decode.
>
> potentially many, many times

for _all_ unreported *correctable* errors.

> . The problem still is that there is no
> output to decode (in the default case).
>
>

i.e. (sorry for the cut-and-paste):

	/* First print corrected ones that are still unlogged */
	for (i = 0; i < MCE_LOG_LEN; i++) {
		struct mce *m = &mcelog.entry[i];
		if (!(m->status & MCI_STATUS_VAL))
			continue;
		if (!(m->status & MCI_STATUS_UC)) {
			print_mce(m);
			if (!apei_err)
				apei_err = apei_write_mce(m);
		}
	}

will potentially result in many bogus messages at a time when we
definitely do not want them.
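
To put a number on "many", here is a rough userspace sketch of that loop.
It is not kernel code: struct mce_sketch, count_hints() and both flags are
made-up names for illustration, and the hint_only_for_uc flag is only one
possible reading of the UC check asked for above. The two status bits use
the architectural MCi_STATUS positions (VAL is bit 63, UC is bit 61). The
function counts how often the 'mcelog --ascii' hint would be emitted while
the unlogged corrected errors are dumped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural MCi_STATUS bits: valid (63) and uncorrected (61). */
#define MCI_STATUS_VAL (1ULL << 63)
#define MCI_STATUS_UC  (1ULL << 61)

/* Hypothetical cut-down stand-in for the kernel's struct mce. */
struct mce_sketch {
	uint64_t status;
};

/*
 * Count the hint lines produced by the "print corrected ones that are
 * still unlogged" loop.
 *
 * decoder_present:  models a registered decoder whose notifier returns
 *                   NOTIFY_STOP, which suppresses the hint.
 * hint_only_for_uc: models an assumed extra check that prints the hint
 *                   only for uncorrected records.
 */
unsigned int count_hints(const struct mce_sketch *log, size_t len,
			 bool decoder_present, bool hint_only_for_uc)
{
	unsigned int hints = 0;

	assert(log != NULL || len == 0);

	for (size_t i = 0; i < len; i++) {
		const struct mce_sketch *m = &log[i];

		if (!(m->status & MCI_STATUS_VAL))
			continue;	/* empty slot */
		if (m->status & MCI_STATUS_UC)
			continue;	/* this pass only prints corrected errors */

		/* print_mce(m) would run here; decide whether it adds the hint. */
		if (decoder_present)
			continue;
		if (hint_only_for_uc && !(m->status & MCI_STATUS_UC))
			continue;
		hints++;
	}
	return hints;
}
```

With no decoder registered (the default case), a log holding N valid
corrected entries gives N hint lines in this model; with either flag set
it gives none.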

P.
