[RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

From: Eugeniu Rosca
Date: Sun Mar 15 2020 - 13:15:04 EST


As pointed out in v5.0 commit 168e06f7937d96 ("kernel/hung_task.c:
force console verbose before panic"), users may miss crucial information
in pre-panic scenarios, depending on the loglevel value set by the
distribion/platform maintainers.

For example, assuming loglevel=3 (i.e. print EMERG/ALERT/CRIT only),
below is the output of a softlockup got with pure vanilla kernel [1].
In contrast, with current patch applied, output [2] is far more helpful.

Save potentially days/weeks/months of debugging efforts for those who:
* deal with a softlockup whose reproduction scenario is not clear
* are constrainted to use a low loglevel value in production
* mostly/exclusively rely on logs like [2] post-mortem

[1] Softlockup output [loglevel=3, pure v5.6-rc5, H3ULCB]
watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [modprobe:2354]

[2] Softlockup output [loglevel=3, v5.6-rc5 + this patch, H3ULCB]
[ 292.259324] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [modprobe:2437]
[ 292.266834] Modules linked in: <-- cut -->
[ 292.302880] irq event stamp: 519660
[ 292.306555] hardirqs last enabled at (519659): [<ffffa00010082a18>] el1_irq+0xd8/0x180
[ 292.314909] hardirqs last disabled at (519660): [<ffffa000100829bc>] el1_irq+0x7c/0x180
[ 292.323274] softirqs last enabled at (519658): [<ffffa00010080d0c>] __do_softirq+0x2e4/0x7c4
[ 292.332179] softirqs last disabled at (519651): [<ffffa000100c3e48>] irq_exit+0x150/0x1e0
[ 292.340720] CPU: 3 PID: 2437 Comm: modprobe Not tainted 5.6.0-rc5+ #96
[ 292.347538] Hardware name: Renesas H3ULCB Kingfisher board based on r8a77951 (DT)
[ 292.355355] pstate: 20000005 (nzCv daif -PAN -UAO)
[ 292.360435] pc : lkdtm_SOFTLOCKUP+0x18/0x20 [lkdtm]
[ 292.365592] lr : lkdtm_SOFTLOCKUP+0x18/0x20 [lkdtm]
[ 292.370699] sp : ffff0006cd9af780
[ 292.374187] x29: ffff0006cd9af780 x28: ffffa0000907d000
[ 292.379757] x27: ffff0006cbaebc10 x26: ffffa000090751c0
[ 292.385327] x25: ffffa0000907dfc0 x24: 0000000000000009
[ 292.390896] x23: ffffa0000907d478 x22: 0000000000000000
[ 292.396464] x21: ffffa0000907d340 x20: ffffa0000907d340
[ 292.402034] x19: ffffa000090751c0 x18: 0000000000000000
[ 292.407603] x17: 0000000000000000 x16: 0000000000000000
[ 292.413171] x15: 0000000000000000 x14: ffffa00000000003
[ 292.418740] x13: ffffa00010dc0180 x12: 0000000000000000
[ 292.424309] x11: ffff8000bfc2401d x10: 1fffe000bfc2401d
[ 292.429877] x9 : dfffa00000000000 x8 : ffffa00010082c00
[ 292.435445] x7 : ffff0005fe1200ef x6 : 0000000000000000
[ 292.441012] x5 : dfffa00000000000 x4 : ffff0006c82e9c50
[ 292.446580] x3 : ffffa0001024b848 x2 : 0000000000000001
[ 292.452147] x1 : 0000000000000000 x0 : 0000000000000000
[ 292.457713] Call trace:
[ 292.460351] lkdtm_SOFTLOCKUP+0x18/0x20 [lkdtm]
[ 292.465147] lkdtm_do_action+0x40/0x44 [lkdtm]
[ 292.469853] lkdtm_register_cpoint+0x44/0xbc [lkdtm]
[ 292.475099] lkdtm_module_init+0x220/0x27c [lkdtm]
[ 292.480120] do_one_initcall+0x1c8/0x40c
[ 292.484244] do_init_module+0xb4/0x330
[ 292.488185] load_module+0x3404/0x34c8
[ 292.492126] __do_sys_finit_module+0x11c/0x13c
[ 292.496787] __arm64_sys_finit_module+0x4c/0x5c
[ 292.501539] el0_svc_common.constprop.0+0x160/0x200
[ 292.506647] do_el0_svc+0x74/0xa4
[ 292.510139] el0_sync_handler+0xc4/0x104
[ 292.514257] el0_sync+0x140/0x180

Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Petr Mladek <pmladek@xxxxxxxx>
Cc: Jisheng Zhang <Jisheng.Zhang@xxxxxxxxxxxxx>
Cc: Valdis Kletnieks <valdis.kletnieks@xxxxxx>
Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Signed-off-by: Eugeniu Rosca <erosca@xxxxxxxxxxxxxx>
---
kernel/watchdog.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index b6b1f54a7837..cbe7e2c4c5fb 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -428,6 +428,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
}
}

+ console_verbose_start();
+
pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
smp_processor_id(), duration,
current->comm, task_pid_nr(current));
@@ -453,6 +455,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
if (softlockup_panic)
panic("softlockup: hung tasks");
__this_cpu_write(soft_watchdog_warn, true);
+
+ console_verbose_end();
} else
__this_cpu_write(soft_watchdog_warn, false);

--
2.25.0