Re: [PATCH v5 1/2] arm64: entry: Skip single stepping into interrupt handlers

From: Ard Biesheuvel
Date: Fri Jan 27 2023 - 06:24:21 EST


On Thu, 26 Jan 2023 at 14:40, Will Deacon <will@xxxxxxxxxx> wrote:
>
> On Mon, Dec 19, 2022 at 03:54:51PM +0530, Sumit Garg wrote:
> > Currently on systems where the timer interrupt (or any other
> > fast-at-human-scale periodic interrupt) is active then it is impossible
> > to step any code with interrupts unlocked because we will always end up
> > stepping into the timer interrupt instead of stepping the user code.
> >
> > The common user's goal while single stepping is that when they step then
> > the system will stop at PC+4 or PC+I for a branch that gets taken
> > relative to the instruction they are stepping. So, fix broken single step
> > implementation via skipping single stepping into interrupt handlers.
> >
> > The methodology is when we receive an interrupt from EL1, check if we
> > are single stepping (pstate.SS). If yes then we save MDSCR_EL1.SS and
> > clear the register bit if it was set. Then unmask only D and leave I set.
> > On return from the interrupt, set D and restore MDSCR_EL1.SS. Along with
> > this skip reschedule if we were stepping.
> >
> > Suggested-by: Will Deacon <will@xxxxxxxxxx>
> > Signed-off-by: Sumit Garg <sumit.garg@xxxxxxxxxx>
> > Tested-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> > ---
> > arch/arm64/kernel/entry-common.c | 22 ++++++++++++++++++++--
> > 1 file changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> > index cce1167199e3..688d1ef8e864 100644
> > --- a/arch/arm64/kernel/entry-common.c
> > +++ b/arch/arm64/kernel/entry-common.c
> > @@ -231,11 +231,15 @@ DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
> > #define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
> > #endif
> >
> > -static void __sched arm64_preempt_schedule_irq(void)
> > +static void __sched arm64_preempt_schedule_irq(struct pt_regs *regs)
> > {
> > if (!need_irq_preemption())
> > return;
> >
> > + /* Don't reschedule in case we are single stepping */
> > + if (!(regs->pstate & DBG_SPSR_SS))
> > + return;
>
> Hmm, isn't this the common case? PSTATE.SS will usually be clear, no?
>
> > * Note: thread_info::preempt_count includes both thread_info::count
> > * and thread_info::need_resched, and is not equivalent to
> > @@ -471,19 +475,33 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
> > do_interrupt_handler(regs, handler);
> > irq_exit_rcu();
> >
> > - arm64_preempt_schedule_irq();
> > + arm64_preempt_schedule_irq(regs);
> >
> > exit_to_kernel_mode(regs);
> > }
> > +
> > static void noinstr el1_interrupt(struct pt_regs *regs,
> > void (*handler)(struct pt_regs *))
> > {
> > + unsigned long mdscr;
> > +
> > + /* Disable single stepping within interrupt handler */
> > + if (regs->pstate & DBG_SPSR_SS) {
> > + mdscr = read_sysreg(mdscr_el1);
> > + write_sysreg(mdscr & ~DBG_MDSCR_SS, mdscr_el1);
> > + }
>
> I think this will break the implicit handling of kernel {break,watch}points.
>
> Sadly, I think any attempts to workaround the issues here are likely just
> to push the problems around. We really need to overhaul the debug exception
> handling logic we have, which means I need to get back to writing up a
> proposal.
>

That would be much appreciated.

This patch makes single step debugging of VMs running under QEMU much
more useful (using QEMU gdbstub), for the same reason as with kdb, as
otherwise, there's a 50/50 chance (in my experience) that doing a
single step will take you to the IRQ handler instead of to the next
instruction in program order.

FWIW, I tested this patch with that scenario, and it seems to work
much better, but not 100%: I still end up in the IRQ handler
occasionally, but considerably less often.
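
For reference, the flow described in the commit message (save
MDSCR_EL1.SS and clear it on interrupt entry when PSTATE.SS is set,
restore it on return, and skip the reschedule while stepping) can be
modelled roughly as below. This is a userspace sketch, not kernel code:
the model_* names and the model_mdscr_el1 variable are hypothetical
stand-ins for the real register accessors, and the exit-side restore is
taken from the commit message rather than from the quoted hunks. The
reschedule check is written in the sense Will's comment implies, i.e.
only the stepping case bails out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as in arch/arm64/include/asm/debug-monitors.h. */
#define DBG_SPSR_SS	(UINT64_C(1) << 21)
#define DBG_MDSCR_SS	(UINT64_C(1) << 0)

/* Hypothetical stand-in for the MDSCR_EL1 system register. */
static uint64_t model_mdscr_el1;

/* Hypothetical, cut-down stand-in for struct pt_regs. */
struct model_regs {
	uint64_t pstate;
};

/* True if the interrupted context was being single stepped. */
static bool model_stepping(const struct model_regs *regs)
{
	return (regs->pstate & DBG_SPSR_SS) != 0;
}

/*
 * Interrupt entry: remember the current MDSCR value and, if the
 * interrupted context was stepping, clear the SS bit so the handler
 * itself is not stepped. Returns the value to restore on exit.
 */
static uint64_t model_irq_entry(const struct model_regs *regs)
{
	uint64_t saved = model_mdscr_el1;

	if (model_stepping(regs))
		model_mdscr_el1 = saved & ~DBG_MDSCR_SS;

	return saved;
}

/* Interrupt exit: put back the MDSCR value saved at entry. */
static void model_irq_exit(const struct model_regs *regs, uint64_t saved)
{
	if (model_stepping(regs))
		model_mdscr_el1 = saved;
}

/* Rescheduling is skipped only while stepping, not in the common case. */
static bool model_may_resched(const struct model_regs *regs)
{
	return !model_stepping(regs);
}
```

Calling model_irq_entry() with PSTATE.SS set leaves every MDSCR bit
other than SS untouched, and model_irq_exit() puts the saved value back.
None of this addresses the kernel {break,watch}point concern raised
above.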