Re: BUG: KASAN: stack-out-of-bounds in unwind_next_frame+0x1df5/0x2650

From: Josh Poimboeuf
Date: Wed Feb 03 2021 - 18:29:59 EST


On Wed, Feb 03, 2021 at 02:41:53PM -0800, Ivan Babrou wrote:
> On Wed, Feb 3, 2021 at 11:05 AM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 03, 2021 at 09:46:55AM -0800, Ivan Babrou wrote:
> > > > Can you pretty please not line-wrap console output? It's unreadable.
> > >
> > > GMail doesn't make it easy, I'll send a link to a pastebin next time.
> > > Let me know if you'd like me to regenerate the decoded stack.
> > >
> > > > > edfd9b7838ba5e47f19ad8466d0565aba5c59bf0 is the first bad commit
> > > > > commit edfd9b7838ba5e47f19ad8466d0565aba5c59bf0
> > > >
> > > > Not sure what tree you're on, but that's not the upstream commit.
> > >
> > > I mentioned that it's a rebased core-static_call-2020-10-12 tag and
> > > added a link to the upstream hash right below.
> > >
> > > > > Author: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
> > > > > Date: Tue Aug 18 15:57:52 2020 +0200
> > > > >
> > > > > tracepoint: Optimize using static_call()
> > > > >
> > > >
> > > > There's a known issue with that patch, can you try:
> > > >
> > > > http://lkml.kernel.org/r/20210202220121.435051654@xxxxxxxxxxx
> > >
> > > I've tried it on top of core-static_call-2020-10-12 tag rebased on top
> > > of v5.9 (to make it reproducible), and the patch did not help. Do I
> > > need to apply the whole series or something else?
> >
> > Can you recreate with this patch, and add "unwind_debug" to the cmdline?
> > It will spit out a bunch of stack data.
>
> Here's the tree I'm building:
>
> * https://github.com/bobrik/linux/tree/ivan/static-call-5.9
>
> It contains:
>
> * v5.9 tag as the base
> * static_call-2020-10-12 tag
> * dm-crypt patches to reproduce the issue with KASAN
> * x86/unwind: Add 'unwind_debug' cmdline option
> * tracepoint: Fix race between tracing and removing tracepoint
>
> The very same issue can be reproduced on 5.10.11 with no patches,
> but I'm going with 5.9, since it boils down to static call changes.
>
> Here's the decoded stack from the kernel with unwind debug enabled:
>
> * https://gist.github.com/bobrik/ed052ac0ae44c880f3170299ad4af56b
>
> See my first email for the exact commands that trigger this.

Thanks. Do you happen to have the original dmesg, before running it
through the post-processing script?

I assume you're using decode_stacktrace.sh? It could use some
improvement: it's stripping the function offset.

Also, spaces are getting inserted in odd places, messing up the alignment.

[ 137.291837][ C0] ffff88809c409858: d7c4f3ce817a1700 (0xd7c4f3ce817a1700)
[ 137.291837][ C0] ffff88809c409860: 0000000000000000 (0x0)
[ 137.291839][ C0] ffff88809c409868: 00000000ffffffff (0xffffffff)
[ 137.291841][ C0] ffff88809c409870: ffffffffa4f01a52 unwind_next_frame (arch/x86/kernel/unwind_orc.c:380 arch/x86/kernel/unwind_orc.c:553)
[ 137.291843][ C0] ffff88809c409878: ffffffffa4f01a52 unwind_next_frame (arch/x86/kernel/unwind_orc.c:380 arch/x86/kernel/unwind_orc.c:553)
[ 137.291844][ C0] ffff88809c409880: ffff88809c409ac8 (0xffff88809c409ac8)
[ 137.291845][ C0] ffff88809c409888: 0000000000000086 (0x86)
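
For anyone who wants to compare the words in a dump like the one above
without depending on column alignment, here is a minimal Python sketch.
It assumes every line has the timestamp/CPU prefix, a 16-digit stack
address, a 16-digit value, and then either the raw value in parentheses
or a decoded symbol, as in the seven lines above. parse_dump_line is a
hypothetical helper, not part of decode_stacktrace.sh or any kernel
script.

```python
import re

# Assumed layout, taken from the seven sample lines only:
#   [ <timestamp>][ <cpu>] <stack addr>: <value> <annotation>
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\[\s*(?P<cpu>\w+)\]\s+"
    r"(?P<addr>[0-9a-f]{16}):\s+(?P<value>[0-9a-f]{16})\s+(?P<note>.*)$"
)


def parse_dump_line(line):
    """Return a dict for one stack-dump line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    note = m.group("note")
    parts = note.split()
    # A note starting with "(" is just the raw value repeated; anything
    # else is treated as a symbol name followed by file:line details.
    symbol = parts[0] if parts and not note.startswith("(") else None
    return {
        "addr": int(m.group("addr"), 16),
        "value": int(m.group("value"), 16),
        "symbol": symbol,
    }


if __name__ == "__main__":
    sample = [
        "[ 137.291837][ C0] ffff88809c409858: d7c4f3ce817a1700 (0xd7c4f3ce817a1700)",
        "[ 137.291841][ C0] ffff88809c409870: ffffffffa4f01a52 unwind_next_frame "
        "(arch/x86/kernel/unwind_orc.c:380 arch/x86/kernel/unwind_orc.c:553)",
    ]
    for row in map(parse_dump_line, sample):
        print(f"{row['addr']:016x} {row['value']:016x} {row['symbol'] or '-'}")
```

The regex tolerates any amount of whitespace inside the bracketed
prefix, so the extra spaces inserted by post-processing do not affect
the parse. It cannot recover the function offset, since that is already
gone from the decoded output.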

--
Josh