Re: [RFC v3] debug: prevent entering debug mode on errors

From: Daniel Thompson
Date: Thu Nov 27 2014 - 04:50:17 EST


On 26/11/14 17:45, Colin Cross wrote:
> On Wed, Nov 26, 2014 at 1:14 AM, Kiran Raparthy <kiran.kumar@xxxxxxxxxx> wrote:
>> From: Colin Cross <ccross@xxxxxxxxxxx>
>>
>> debug: prevent entering debug mode on errors
>>
>> On non-developer devices kgdb prevents CONFIG_PANIC_TIMEOUT from rebooting the
>> device after a panic.
>>
>> In case of panics and exceptions, to honor CONFIG_PANIC_TIMEOUT, prevent
>> entering debug mode to avoid getting stuck waiting for the user to interact
>> with debugger.
>>
>> Cc: Jason Wessel <jason.wessel@xxxxxxxxxxxxx>
>> Cc: kgdb-bugreport@xxxxxxxxxxxxxxxxxxxxx
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>> Cc: Android Kernel Team <kernel-team@xxxxxxxxxxx>
>> Cc: John Stultz <john.stultz@xxxxxxxxxx>
>> Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx>
>> Signed-off-by: Colin Cross <ccross@xxxxxxxxxxx>
>> [Kiran: Added context to commit message.
>> panic_timeout is used instead of break_on_panic and
>> break_on_exception to honor CONFIG_PANIC_TIMEOUT]
>> Signed-off-by: Kiran Raparthy <kiran.kumar@xxxxxxxxxx>
>> ---
>> kernel/debug/debug_core.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
>> index 1adf62b..0012a1f 100644
>> --- a/kernel/debug/debug_core.c
>> +++ b/kernel/debug/debug_core.c
>> @@ -689,6 +689,14 @@ kgdb_handle_exception(int evector, int signo, int ecode, struct pt_regs *regs)
>>
>> if (arch_kgdb_ops.enable_nmi)
>> arch_kgdb_ops.enable_nmi(0);
>> + /*
>> + * Avoid entering the debugger if we were triggered due to an oops
>> + * but panic_timeout indicates the system should automatically
>> + * reboot on panic. We don't want to get stuck waiting for input
>> + * on such systems, especially if it's "just" an oops.
>> + */
>> + if (signo != SIGTRAP && panic_timeout)
>> + return 1;
>>
>> memset(ks, 0, sizeof(struct kgdb_state));
>> ks->cpu = raw_smp_processor_id();
>> @@ -821,6 +829,15 @@ static int kgdb_panic_event(struct notifier_block *self,
>> unsigned long val,
>> void *data)
>> {
>> + /*
>> + * Avoid entering the debugger if we were triggered due to a panic.
>> + * We don't want to get stuck waiting for input from user in such case.
>> + * panic_timeout indicates the system should automatically
>> + * reboot on panic.
>> + */
>> + if (panic_timeout)
>> + return NOTIFY_DONE;
>> +
>> if (dbg_kdb_mode)
>> kdb_printf("PANIC: %s\n", (char *)data);
>> kgdb_breakpoint();
>
> The original patch was more useful as it allowed re-enabling break on
> panic on specific devices where you were trying to debug a
> reproducible issue. What about using a module_param similar to
> kgdbreboot, but setting the default based on CONFIG_PANIC_TIMEOUT to
> avoid extra configuration?

This change was due to my review so perhaps I'd better answer this...

panic_timeout is the value of the panic sysctl. In addition to the
normal sysctl tooling (which I don't think is available on most Android
systems), it can be set on the kernel command line (e.g. panic=0) or at
runtime via /proc/sys/kernel/panic.

CONFIG_PANIC_TIMEOUT merely sets the default value of the sysctl, so
break on panic can still be re-enabled on a specific device by setting
the timeout to zero. I guess the patch description could be improved to
make this clearer.
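
For anyone wanting to check what a given device is actually running
with, here is a minimal user-space sketch that reads the current value.
The helper name read_panic_timeout() is made up for illustration; only
the /proc/sys/kernel/panic path comes from the discussion above.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel or libc API): read a single integer
 * from a procfs-style file such as /proc/sys/kernel/panic.
 *
 * Returns 0 and stores the value in *timeout on success, -1 on failure.
 */
int read_panic_timeout(const char *path, int *timeout)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	/* The file holds one decimal integer, possibly negative. */
	ok = (fscanf(f, "%d", timeout) == 1);
	fclose(f);

	return ok ? 0 : -1;
}
```

Calling read_panic_timeout("/proc/sys/kernel/panic", &t) and finding
t == 0 would mean the checks in the patch do not fire and kgdb is still
entered.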

Therefore the only loss of function I expect versus the original is
this: with panic_timeout set to zero the debugger is entered on every
oops as well as on panic, so it would be hard to get as far as a
reproducible panic if the system also has a ton of reproducible oopses
that we don't want to fix. Is such a use-case important?
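
To make the behaviour concrete, here is a rough user-space model of the
two checks in the patch. The function names are invented for this
sketch and it is not kernel code; it only mirrors the conditions quoted
above.

```c
#include <signal.h>
#include <stdbool.h>

/*
 * Models the check added to kgdb_handle_exception(): anything other
 * than a breakpoint trap skips the debugger when a reboot timeout is
 * configured. (Hypothetical name, for illustration only.)
 */
bool skips_debugger_on_exception(int signo, int panic_timeout)
{
	return signo != SIGTRAP && panic_timeout != 0;
}

/*
 * Models the check added to kgdb_panic_event(): a non-zero timeout
 * means the panic notifier returns without entering the debugger.
 * (Hypothetical name, for illustration only.)
 */
bool skips_debugger_on_panic(int panic_timeout)
{
	return panic_timeout != 0;
}
```

With panic_timeout == 0 both functions return false, which is the case
described above: oopses and panics alike stop in the debugger, and there
is no way to ask for one without the other.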