Re: [PATCH 1/2] x86/unwind/orc: recheck address range after stack info was updated

From: Josh Poimboeuf
Date: Fri Apr 15 2022 - 22:06:42 EST


On Tue, Apr 12, 2022 at 12:08:37PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 12, 2022 at 10:40:03AM +0300, Dmitry Monakhov wrote:
> > get_stack_info() detects the stack type only from the start address, so we
> > must check that the address range in question is fully covered by the
> > detected stack.
> >
> > Otherwise the following crash is possible:
> > -> unwind_next_frame
> > case ORC_TYPE_REGS:
> > if (!deref_stack_regs(state, sp, &state->ip, &state->sp))
> > -> deref_stack_regs
> > -> stack_access_ok <- here addr is inside the stack range but addr+len-1 is not, yet we still return success
> > *ip = READ_ONCE_NOCHECK(regs->ip); <- Here we hit stack guard fault
> > OOPS LOG:
> > <0>[ 1941.845743] BUG: stack guard page was hit at 000000000dd984a2 (stack is 00000000d1caafca..00000000613712f0)
>
>
> > <4>[ 1941.845751] get_perf_callchain+0x10d/0x280
> > <4>[ 1941.845751] perf_callchain+0x6e/0x80
> > <4>[ 1941.845752] perf_prepare_sample+0x87/0x540
> > <4>[ 1941.845752] perf_event_output_forward+0x31/0x90
> > <4>[ 1941.845753] __perf_event_overflow+0x5a/0xf0
> > <4>[ 1941.845754] perf_ibs_handle_irq+0x340/0x5b0
> > <4>[ 1941.845757] perf_ibs_nmi_handler+0x34/0x60
> > <4>[ 1941.845757] nmi_handle+0x79/0x190
>
> Urgh, this is another instance of trying to unwind an IP that no longer
> matches the stack.
>
> Fixing the unwinder bug is good, but arguably we should also fix this
> IBS stuff, see 6cbc304f2f36 ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")

I remember that nastiness well. So it's still broken? Or is this a
regression? Maybe we wouldn't notice it except for this triggered
unwinder bug?
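
For reference, the check described in the quoted patch description (the
whole range must be on the detected stack, not just its first byte) can be
sketched in user-space C. This is only an illustration under assumptions,
not the kernel code: struct stack_range, start_on_stack() and
range_on_stack() are hypothetical names, and the stack is modelled as a
half-open interval [begin, end).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a detected stack: covers [begin, end). */
struct stack_range {
	uintptr_t begin;
	uintptr_t end;
};

/* Start-address-only check: says nothing about where the access ends. */
static bool start_on_stack(const struct stack_range *s, uintptr_t addr)
{
	return addr >= s->begin && addr < s->end;
}

/* Full-range check: every byte of [addr, addr + len) must be on the stack. */
static bool range_on_stack(const struct stack_range *s, uintptr_t addr,
			   size_t len)
{
	assert(s->begin <= s->end);	/* sanity check on the model itself */

	if (len == 0 || !start_on_stack(s, addr))
		return false;

	/* addr < end here, so the subtraction cannot wrap around. */
	return len <= s->end - addr;
}
```

With a stack of [0x1000, 0x2000), a 16-byte read at 0x1ff8 passes
start_on_stack() but fails range_on_stack(), which is the shape of the
access that ran into the guard page in the report above.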

--
Josh