[PATCH 1/2] arm64: kgdb: Fix incorrect single stepping into the irq handler

From: Sumit Garg
Date: Mon Apr 11 2022 - 05:38:55 EST


PSTATE.I and PSTATE.D are very important for single step working.

Without disabling interrupt on local CPU, there is a chance of
interrupt occurrence in the period of exception return and start of
kgdb/kdb single-step, that result in wrongly single stepping into the
interrupt handler. And if D bit is set then, it results into undefined
exception and when it's handler enables dbg then single step exception
is generated, not as expected.

Currently when we execute single step in kdb/kgdb, we may see it jumps
to the irq_handler (where PSTATE.D is cleared) instead of the next
instruction. And a resume after single stepping into interrupt handler
sometimes leads to unbalanced locking:

[ 300.328300] WARNING: bad unlock balance detected!
[ 300.328608] 5.18.0-rc1-00016-g3e732ebf7316-dirty #6 Not tainted
[ 300.329058] -------------------------------------
[ 300.329298] sh/173 is trying to release lock (dbg_slave_lock) at:
[ 300.329718] [<ffffd57c951c016c>] kgdb_cpu_enter+0x7ac/0x820
[ 300.330029] but there are no more locks to release!
[ 300.330265]
[ 300.330265] other info that might help us debug this:
[ 300.330668] 4 locks held by sh/173:
[ 300.330891] #0: ffff4f5e454d8438 (sb_writers#3){.+.+}-{0:0}, at: vfs_write+0x98/0x204
[ 300.331735] #1: ffffd57c973bc2f0 (dbg_slave_lock){+.+.}-{2:2}, at: kgdb_cpu_enter+0x5b4/0x820
[ 300.332259] #2: ffffd57c973a9460 (rcu_read_lock){....}-{1:2}, at: kgdb_cpu_enter+0xe0/0x820
[ 300.332717] #3: ffffd57c973bc2a8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x1ec/0x820

Add the save and restore work for single-step while enabling and
disabling single stepping to maintain the PSTATE.I and PSTATE.D carefully.

Fixes: 44679a4f142b ("arm64: KGDB: Add step debugging support")
Co-developed-by: Wei Li <liwei391@xxxxxxxxxx>
Signed-off-by: Wei Li <liwei391@xxxxxxxxxx>
Signed-off-by: Sumit Garg <sumit.garg@xxxxxxxxxx>
---
arch/arm64/kernel/kgdb.c | 33 +++++++++++++++++++++++++++++++--
1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index 2aede780fb80..653ad0d19f2f 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -15,6 +15,7 @@
#include <linux/kprobes.h>
#include <linux/sched/task_stack.h>

+#include <asm/daifflags.h>
#include <asm/debug-monitors.h>
#include <asm/insn.h>
#include <asm/patching.h>
@@ -171,6 +172,30 @@ static void kgdb_arch_update_addr(struct pt_regs *regs,
compiled_break = 0;
}

+/*
+ * Interrupts need to be disabled before single-step mode is set, and not
+ * re-enabled until after single-step mode ends. Without disabling interrupt
+ * on local CPU, there is a chance of interrupt occurrence in the period of
+ * exception return and start of kgdb/kdb single-step, that result in wrongly
+ * single stepping into the interrupt handler. Also, resume from single
+ * stepping the interrupt handler is risky as it sometimes leads to unbalanced
+ * locking.
+ */
+static DEFINE_PER_CPU(unsigned long, kgdb_ss_flags);
+
+static void kgdb_save_local_irqflag(struct pt_regs *regs)
+{
+ __this_cpu_write(kgdb_ss_flags, (regs->pstate & DAIF_MASK));
+ regs->pstate |= PSR_I_BIT;
+ regs->pstate &= ~PSR_D_BIT;
+}
+
+static void kgdb_restore_local_irqflag(struct pt_regs *regs)
+{
+ regs->pstate &= ~DAIF_MASK;
+ regs->pstate |= __this_cpu_read(kgdb_ss_flags);
+}
+
int kgdb_arch_handle_exception(int exception_vector, int signo,
int err_code, char *remcom_in_buffer,
char *remcom_out_buffer,
@@ -201,8 +226,10 @@ int kgdb_arch_handle_exception(int exception_vector, int signo,
/*
* Received continue command, disable single step
*/
- if (kernel_active_single_step())
+ if (kernel_active_single_step()) {
+ kgdb_restore_local_irqflag(linux_regs);
kernel_disable_single_step();
+ }

err = 0;
break;
@@ -222,8 +249,10 @@ int kgdb_arch_handle_exception(int exception_vector, int signo,
/*
* Enable single step handling
*/
- if (!kernel_active_single_step())
+ if (!kernel_active_single_step()) {
+ kgdb_save_local_irqflag(linux_regs);
kernel_enable_single_step(linux_regs);
+ }
err = 0;
break;
default:
--
2.25.1