Re: [Patch 1/6] XPANIC: Add extended panic interface

From: Eric W. Biederman
Date: Fri May 27 2011 - 13:59:35 EST


"K.Prasad" <prasad@xxxxxxxxxxxxxxxxxx> writes:

> commit e668fa1aea7844ac4c7ea09030a2f3e647a4adb1
> Author: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> Date: Fri Nov 19 18:36:44 2010 +0100
>
> XPANIC: Add extended panic interface
>
> One panic size doesn't fit all.
>
> Machine check has some special requirements on panic, like:
> - It wants to do a reboot timeout by default so that the machine
> check event can be logged to disk after warm reboot.

There is certainly not comment to that effect and I have not see a
mechanism that does that.

> - For memory errors it usually doesn't want to do crash dumps because
> that can lead to crash loops (dumping corrupted memory would
> lead to another machine check)

It usually doesn't, and it would not lead to a crash loop just a failure
to capture the cause of the crash.

> - It doesn't want to do backtraces because machine checks
> are not a kernel bug.

But it sure is nice to know where it happened anyway, so you can guess
the damage. This sounds like more of the Andi thing where he doesn't
want to hear about memory errors, because it is not a software problem.

That is definitely not sufficient reason to turn off backtraces.

> In a earlier patch this was done with various adhoc hacks,
> but it's cleaner to extend panic to a 'xpanic' that directly
> gets a flag and timeout argument.
>
> The only user right now will be x86 machine checks, but I consider
> it likely that other users will switch to this too.
>
> For example one obvious candidate would be the "no root
> found" panic which doesn't really deserve a backtrace.

That does seem like a good fit for that case. However it doesn't seem
like a good fit for your case.

> I exported a vpanic() interface too as a global. That's not
> needed by the current user, but the interface has to exist
> internally anyways and I could see how other code would
> find a v* variant of panic useful.
>
> Originally based on a suggestion by H. Peter Anvin.
> Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>

Nacked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

This entire chunk of two patches appears to be complete non-sense.

mce_panic_timeout is hard coded policy that userspace does not get
to control, that overrides userspace policy.

It looks to me like the right solution is just to delete the
mce_panic_timeout.

As for the extra flags that you add the panic path. They can
only make the code more fragile.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/