Re: [syzbot] BUG: unable to handle kernel paging request in get_desc

From: Dmitry Vyukov
Date: Fri Nov 04 2022 - 14:45:27 EST


On Fri, 4 Nov 2022 at 11:41, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Nov 04, 2022, Dmitry Vyukov wrote:
> > On Fri, 4 Nov 2022 at 10:39, 'Sean Christopherson' via syzkaller-bugs
> > <syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Fri, Nov 04, 2022, Sean Christopherson wrote:
> > > > On Fri, Nov 04, 2022, Dmitry Vyukov wrote:
> > > > > On Fri, 4 Nov 2022 at 08:28, 'Sean Christopherson' via syzkaller-bugs
> > > > > <syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote:
> > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > > > Reported-by: syzbot+ffb4f000dc2872c93f62@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > > > >
> > > > > > > BUG: unable to handle page fault for address: fffffbc5a1c22e00
> > > > > > > #PF: supervisor read access in kernel mode
> > > > > > > #PF: error_code(0x0000) - not-present page
> > > > > > > PGD 23ffe4067 P4D 23ffe4067 PUD 13ff2d067 PMD 13ff2c067 PTE 0
> > > > > > > Oops: 0000 [#1] PREEMPT SMP KASAN
> > > > > > > CPU: 0 PID: 5368 Comm: syz-executor.2 Not tainted 6.1.0-rc3-next-20221103-syzkaller #0
> > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022
> > > > > > > RIP: 0010:get_desc+0x128/0x460 arch/x86/lib/insn-eval.c:660
> > > > > >
> > > > > > I'm pretty sure this is the same thing as
> > > > > >
> > > > > > BUG: unable to handle kernel paging request in vmx_handle_exit_irqoff
> > > > > >
> > > > > > I'll verify and get a patch posted shortly.
> > > > >
> > > > > This repro does not create any VMs, it's just:
> > > > >
> > > > > iopl(0x3)
> > > > > rt_sigreturn()
> > > > >
> > > > > Do you still think it's related to the vmx_handle_exit_irqoff issue?
> > > >
> > > > Yes, the issue is that the shadow for the read-only IDT mapping in the CPU entry
> > > > area isn't populated (commit 9fd429c28073 ("x86/kasan: Map shadow for percpu pages
> > > > on demand") is to blame). The bug manifests anytime software manually does an IDT
> > > > lookup.
> > >
> > > Hrm, but the lookup is into the GDT, not the IDT, and I haven't been able to reproduce
> > > this one. I'll leave it open for now.
> >
> > The repro calls rt_sigreturn() w/o actually setting up the signal
> > frame (mcontext, etc). So I assume the kernel will restore completely
> > bogus/random user-space mcontext. The data it reads from the stack may
> > be uninit or depend on the compiler, etc.
> >
> > As a result it should get a completely random segment selector here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/x86/lib/insn-eval.c?id=81214a573d19ae2fa5b528286ba23cd1cb17feec#n725
> >
> > Can it be out-of-bounds or something?
>
> The lookup is on CS.base (I trimmed the stack in my first reply) as part of the
> IOPL emulation to see if userspace is attempting CLI or STI, so it's not related
> to the sigframe.
>
> insn_get_seg_base arch/x86/lib/insn-eval.c:725 [inline]
> insn_get_effective_ip+0x187/0x1f0 arch/x86/lib/insn-eval.c:1476
> fixup_iopl_exception+0xd0/0x190 arch/x86/kernel/traps.c:627
> __exc_general_protection arch/x86/kernel/traps.c:752 [inline]
> exc_general_protection+0x176/0x210 arch/x86/kernel/traps.c:728
> asm_exc_general_protection+0x22/0x30 arch/x86/include/asm/idtentry.h:564
> RIP: 0003:0x7f250f3abf8c
>
> It does look like some form of out-of-bounds selector though. The offset in the
> splat suggests CS.sel is something way above __USER_CS, which would explain why
> insn_get_effective_ip() is doing a lookup in the first place (CS.base is assumed
> to be 0 if userspace is in 64-bit mode; user_64bit_mode() is true if CS == __USER_CS).
> I just can't figure out how that tiny reproducer is getting a bad CS. And the above
> RIP strongly suggests userspace is indeed in 64-bit mode.

My understanding is that rt_sigreturn() restores the complete user context
from the info stored on the stack.
Normally signal delivery stores that info on the stack first. But
in this case there is no signal delivery, so rt_sigreturn() reads
complete garbage from the stack and restores it into the context. I
assume this can set up any nonsensical CS and maybe even pretend this is
not normal x86_64 mode (?).
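
To make the "out-of-bounds selector" question concrete, here is a minimal
sketch of how an x86 segment selector splits into RPL, table indicator and
descriptor index, and how a garbage value can point past the end of a
descriptor table. This is not the kernel's get_desc(); the names
sel_fields, decode_selector and selector_in_table are hypothetical, and the
table limit is whatever the caller passes in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an x86 segment selector (hypothetical helper type, not a kernel struct). */
struct sel_fields {
	unsigned int rpl;    /* bits 1:0, requested privilege level */
	unsigned int ti;     /* bit 2, table indicator: 0 = GDT, 1 = LDT */
	unsigned int index;  /* bits 15:3, descriptor index */
	unsigned int offset; /* byte offset into the table, index * 8 */
};

/* Split a 16-bit selector into its architectural fields. */
static struct sel_fields decode_selector(uint16_t sel)
{
	struct sel_fields f;

	f.rpl = sel & 0x3;
	f.ti = (sel >> 2) & 0x1;
	f.index = sel >> 3;
	f.offset = f.index * 8;
	return f;
}

/*
 * An 8-byte descriptor fits in a table only if its last byte is within
 * the table limit, where limit is the table size in bytes minus one.
 */
static bool selector_in_table(uint16_t sel, unsigned int limit)
{
	return decode_selector(sel).offset + 7 <= limit;
}
```

With these helpers, 0x33 (the x86_64 __USER_CS value) decodes to index 6,
RPL 3, byte offset 48. Assuming a 16-entry table (limit 127), a garbage
selector such as 0xfff3 lands at byte offset 65520, far past the end.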