Re: [Patch-next] Remove notify_die in do_machine_check function

From: Andi Kleen
Date: Thu May 27 2010 - 02:54:58 EST



I have heard that on some machines, a hardware error output pin of the
chipset may be wired to an input pin of the CPU that can trigger an MCE.

Yes, that happens.

That is, MCE is used to report some chipset errors too. I think that is
why notify_die is called in do_machine_check. Simply removing notify_die
is not good for these machines.

In general, deciding what to do on an MCE is rather complicated
and probably too much for any die handler.

Maybe we should fix the notifier user instead. Which notifier user
consumes the DIE_NMI notification?

Yes. It would be good to find out which user it is. Perhaps gdb?

One approach would be to give it a different type (DIE_MCE).
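
To illustrate the idea, here is a rough user-space sketch, not kernel
code. It assumes a simplified notifier chain; register_handler(),
notify(), nmi_only_handler() and the EV_* return codes are hypothetical
stand-ins, and DIE_MCE is the proposed new type, which does not exist
today. The enum values are arbitrary. The point is only that a user
that checks for DIE_NMI would no longer see machine checks.

```c
#include <stddef.h>

/* Event types. DIE_MCE is the proposed (hypothetical) new type;
 * the numeric values are arbitrary for this sketch. */
enum die_val { DIE_NMI = 1, DIE_MCE };

/* Simplified stand-ins for notifier return codes. */
#define EV_IGNORED  0   /* handler did not care about the event */
#define EV_CONSUMED 1   /* handler claimed the event */

typedef int (*die_handler_t)(enum die_val val);

#define MAX_HANDLERS 8
static die_handler_t handlers[MAX_HANDLERS];
static size_t nr_handlers;

/* Add a handler to the chain; returns -1 if the chain is full. */
int register_handler(die_handler_t fn)
{
    if (nr_handlers == MAX_HANDLERS)
        return -1;
    handlers[nr_handlers++] = fn;
    return 0;
}

/* Walk the chain until some handler claims the event. */
int notify(enum die_val val)
{
    for (size_t i = 0; i < nr_handlers; i++)
        if (handlers[i](val) == EV_CONSUMED)
            return EV_CONSUMED;
    return EV_IGNORED;
}

/* A debugger-style user that only wants NMIs. */
int nmi_only_handler(enum die_val val)
{
    return val == DIE_NMI ? EV_CONSUMED : EV_IGNORED;
}
```

With nmi_only_handler registered, notify(DIE_NMI) is claimed while
notify(DIE_MCE) passes through the chain untouched.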

But today we don't really need it. notify_die() is primarily for debuggers
of all kinds, and I never liked the idea of calling a debugger on a machine
check.

So I would be ok with just removing the call.

-Andi
