Re: [PATCH v6 4/4] kdb: Don't back trace on a cpu that didn't round up

From: Daniel Thompson
Date: Mon Dec 03 2018 - 11:03:26 EST


On Tue, Nov 27, 2018 at 09:38:39AM -0800, Douglas Anderson wrote:
> If you have a CPU that fails to round up and then run 'btc' you'll end
> up crashing in kdb because we dereferenced NULL. Let's add a check.
> It's wise to also set the task to NULL when leaving the debugger so
> that if we fail to round up on a later entry into the debugger we
> won't backtrace a stale task.
>
> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>

Acked-by: Daniel Thompson <daniel.thompson@xxxxxxxxxx>

> ---
>
> Changes in v6: None
> Changes in v5: None
> Changes in v4:
> - Also clear out .debuggerinfo.
> - Also clear out .debuggerinfo and .task for the master.
> - Remove clearing out in kdb_stub for offline CPUs; it's now redundant.
>
> Changes in v3:
> - Patch ("kdb: Don't back trace on a cpu that didn't round up") new for v3.
>
> Changes in v2: None
>
> kernel/debug/debug_core.c | 4 ++++
> kernel/debug/kdb/kdb_bt.c | 11 ++++++++++-
> kernel/debug/kdb/kdb_debugger.c | 7 -------
> 3 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 1fb8b239e567..5cc608de6883 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -592,6 +592,8 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
>  				arch_kgdb_ops.correct_hw_break();
>  			if (trace_on)
>  				tracing_on();
> +			kgdb_info[cpu].debuggerinfo = NULL;
> +			kgdb_info[cpu].task = NULL;
>  			kgdb_info[cpu].exception_state &=
>  				~(DCPU_WANT_MASTER | DCPU_IS_SLAVE);
>  			kgdb_info[cpu].enter_kgdb--;
> @@ -724,6 +726,8 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
>  	if (trace_on)
>  		tracing_on();
>
> +	kgdb_info[cpu].debuggerinfo = NULL;
> +	kgdb_info[cpu].task = NULL;
>  	kgdb_info[cpu].exception_state &=
>  		~(DCPU_WANT_MASTER | DCPU_IS_SLAVE);
>  	kgdb_info[cpu].enter_kgdb--;
> diff --git a/kernel/debug/kdb/kdb_bt.c b/kernel/debug/kdb/kdb_bt.c
> index 7921ae4fca8d..7e2379aa0a1e 100644
> --- a/kernel/debug/kdb/kdb_bt.c
> +++ b/kernel/debug/kdb/kdb_bt.c
> @@ -186,7 +186,16 @@ kdb_bt(int argc, const char **argv)
>  		kdb_printf("btc: cpu status: ");
>  		kdb_parse("cpu\n");
>  		for_each_online_cpu(cpu) {
> -			sprintf(buf, "btt 0x%px\n", KDB_TSK(cpu));
> +			void *kdb_tsk = KDB_TSK(cpu);
> +
> +			/* If a CPU failed to round up we could be here */
> +			if (!kdb_tsk) {
> +				kdb_printf("WARNING: no task for cpu %ld\n",
> +					   cpu);
> +				continue;
> +			}
> +
> +			sprintf(buf, "btt 0x%px\n", kdb_tsk);
>  			kdb_parse(buf);
>  			touch_nmi_watchdog();
>  		}
> diff --git a/kernel/debug/kdb/kdb_debugger.c b/kernel/debug/kdb/kdb_debugger.c
> index 15e1a7af5dd0..53a0df6e4d92 100644
> --- a/kernel/debug/kdb/kdb_debugger.c
> +++ b/kernel/debug/kdb/kdb_debugger.c
> @@ -118,13 +118,6 @@ int kdb_stub(struct kgdb_state *ks)
>  	kdb_bp_remove();
>  	KDB_STATE_CLEAR(DOING_SS);
>  	KDB_STATE_SET(PAGER);
> -	/* zero out any offline cpu data */
> -	for_each_present_cpu(i) {
> -		if (!cpu_online(i)) {
> -			kgdb_info[i].debuggerinfo = NULL;
> -			kgdb_info[i].task = NULL;
> -		}
> -	}
>  	if (ks->err_code == DIE_OOPS || reason == KDB_REASON_OOPS) {
>  		ks->pass_exception = 1;
>  		KDB_FLAG_SET(CATASTROPHIC);
> --
> 2.20.0.rc0.387.gc7a69e6b6c-goog
>
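
For anyone reading along, the pattern the patch relies on can be sketched
in plain userspace C. This is only an illustration under stated
assumptions, not the kernel code: struct task, struct cpu_slot, slots[],
NR_CPUS, debugger_enter(), debugger_exit() and backtrace_all() are all
hypothetical stand-ins for task_struct, kgdb_info[] and the btc loop.
The idea is that each CPU's slot is filled on entry, cleared on exit, and
the backtrace loop skips any slot that is still NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for the kernel's struct task_struct. */
struct task {
	const char *comm;
};

/* Hypothetical stand-in for one kgdb_info[] entry (task field only). */
struct cpu_slot {
	struct task *task;
};

static struct cpu_slot slots[NR_CPUS];	/* zero-initialized: all NULL */

/* A CPU that rounds up records its current task on debugger entry. */
static void debugger_enter(int cpu, struct task *t)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	slots[cpu].task = t;
}

/* On exit, clear the slot so a later entry cannot see a stale task. */
static void debugger_exit(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	slots[cpu].task = NULL;
}

/*
 * Walk every CPU and "backtrace" its task. A CPU whose slot is NULL
 * (it never rounded up) gets a warning instead of a NULL dereference.
 * Returns the number of CPUs that were actually backtraced.
 */
static int backtrace_all(void)
{
	int cpu, traced = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		struct task *t = slots[cpu].task;

		if (!t) {
			printf("WARNING: no task for cpu %d\n", cpu);
			continue;
		}
		printf("cpu %d: backtrace of %s\n", cpu, t->comm);
		traced++;
	}
	return traced;
}
```

Without the clearing step in debugger_exit(), a CPU that rounded up on one
entry but not on the next would still have a non-NULL slot, so the NULL
check alone would not be enough to avoid backtracing a stale task.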