Re: [PATCH] arm64/trap: fix broken ct->nmi_nesting when die() is called in a kthread

From: Mark Rutland
Date: Tue Jun 03 2025 - 09:46:31 EST


On Tue, Jun 03, 2025 at 12:14:18PM +0100, Yeoreum Yun wrote:
> > On Mon, Jun 02, 2025 at 06:50:53PM +0100, Yeoreum Yun wrote:
> > > So, what I think:
> > > 1. arm64_enter_el1_dbg() should ct_nmi_enter() as it is.
> > > 2. in bug_handler() while handling BUG_TYPE, add above ct_nmi_exit()
> > > conditional call.
> > > 3. DAIF.D and DAIF.A handling.
> >
> > No, that is not safe. In step 2, calling ct_nmi_exit() would undo *all*
> > of the ct_nmi_enter() logic, and may stop RCU from watching if the
> > exception was entered from some intermediate/inconsistent state.
>
> Yes if call ct_nmi_enter() without condition.
> But I imply with the condition check what I posted.
> if CT_NESTING_IRQ_NONIDLE,
> it wouldn't need call and that cpu can be watched by RCU.

I am not keen on conditionally calling ct_nmi_exit(), and would strongly
prefer to avoid that, regardless of where that lives in the flow.

I suspect that it would be better to triage the interrupted context
earlier, and rethink the way entry/exit works, but that's a much larger
bit of work and will take more thinking.

Mark.