Re: [syzbot] upstream boot error: WARNING in __context_tracking_enter

From: Dmitry Vyukov
Date: Tue Apr 13 2021 - 01:14:23 EST


On Mon, Mar 22, 2021 at 6:22 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> Hi Russell,
>
> On Fri, Mar 19, 2021 at 10:10:43AM +0000, Russell King - ARM Linux admin wrote:
> > On Fri, Mar 19, 2021 at 10:54:48AM +0100, Dmitry Vyukov wrote:
> > > On Fri, Mar 19, 2021 at 10:44 AM syzbot
> > > <syzbot+f09a12b2c77bfbbf51bd@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit: 8b12a62a Merge tag 'drm-fixes-2021-03-19' of git://anongit..
> > > > git tree: upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17e815aed00000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=cfeed364fc353c32
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=f09a12b2c77bfbbf51bd
> > > > userspace arch: arm
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+f09a12b2c77bfbbf51bd@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > >
> > > +Mark, arm
> > > It did not get far with CONFIG_CONTEXT_TRACKING_FORCE (kernel doesn't boot).
> >
> > It seems that the path:
> >
> > context_tracking_user_enter()
> > user_enter()
> > context_tracking_enter()
> > __context_tracking_enter()
> > vtime_user_enter()
> >
> > expects preemption to be disabled. It effectively is, because local
> > interrupts are disabled by context_tracking_enter().
> >
> > However, the requirement for preemption to be disabled is not
> > documented... so shrug. Maybe someone can say what the real requirements
> > are here.
>
> From dealing with this recently on arm64, this is a bit messy. To
> handle this robustly we need to do a few things in sequence, including
> using the *_irqoff() variants of the context_tracking_user_*()
> functions.
>
> I wrote down the constraints in commit:
>
> 23529049c6842382 ("arm64: entry: fix non-NMI user<->kernel transitions")
>
> For user->kernel transitions, the arch code needs the following sequence
> before invoking arbitrary kernel C code:
>
> lockdep_hardirqs_off(CALLER_ADDR0);
> user_exit_irqoff();
> trace_hardirqs_off_finish();
>
> For kernel->user transitions, the arch code needs the following sequence
> once it will no longer invoke arbitrary kernel C code, just before
> returning to userspace:
>
> trace_hardirqs_on_prepare();
> lockdep_hardirqs_on_prepare(CALLER_ADDR0);
> user_enter_irqoff();
> lockdep_hardirqs_on(CALLER_ADDR0);

Hi Russell,

Does Mark's comment make sense to you?
lockdep_assert_preemption_disabled() also checks "&&
this_cpu_read(hardirqs_enabled)", so do we also need hardirqs
disabled around user_enter/exit?
This issue currently prevents ARM boot on syzbot.
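
To make the question concrete, here is a rough userspace sketch of the
condition as I read it. It is a model, not kernel code: struct cpu_model,
assert_would_warn() and model_lockdep_hardirqs_off() are hypothetical
names, and it assumes the check reduces to "preempt_count() == 0 &&
hardirqs_enabled", ignoring any config or lockdep-enabled guards the real
macro may have.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one CPU's state; not kernel code. */
struct cpu_model {
	int preempt_count;	/* stands in for preempt_count() */
	bool hardirqs_enabled;	/* lockdep's software view of the IRQ state */
};

/*
 * Assumed shape of the check: it fires only when preemption is not
 * disabled AND lockdep believes hardirqs are enabled.
 */
static bool assert_would_warn(const struct cpu_model *cpu)
{
	return cpu->preempt_count == 0 && cpu->hardirqs_enabled;
}

/* Models the effect of lockdep_hardirqs_off() on lockdep's view. */
static void model_lockdep_hardirqs_off(struct cpu_model *cpu)
{
	cpu->hardirqs_enabled = false;
}
```

In this model, a CPU with preempt_count == 0 whose lockdep state still
says "hardirqs enabled" trips the check, and it stops tripping once
model_lockdep_hardirqs_off() has run. If that reading is right, the
lockdep_hardirqs_off() call ahead of user_exit_irqoff() in Mark's
sequence is what keeps the assertion quiet.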