Re: [syzbot] INFO: rcu detected stall in syscall_exit_to_user_mode

From: Dmitry Vyukov
Date: Wed Sep 15 2021 - 05:14:23 EST


On Wed, 15 Sept 2021 at 10:57, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Tue, Sep 14 2021 at 20:00, Dmitry Vyukov wrote:
> > On Tue, 14 Sept 2021 at 16:58, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >> Now what happens when the mac80211 callback rearms the timer so it
> >> expires immediately again:
> >>
> >> hrtimer_forward(&data->beacon_timer, hrtimer_get_expires(timer),
> >> ns_to_ktime(bcn_int * NSEC_PER_USEC));
> >>
> >> bcn is a user space controlled value. Now lets assume that bcn_int is <=1,
> >> which would certainly cause the loop in hrtimer_run_queues() to keeping
> >> looping forever.
> >>
> >> That should be easy to verify by implementing a simple test which
> >> reschedules a hrtimer from the callback with a expiry time close to now.
> >>
> >> Not today as I'm about to head home to fire up the pizza oven.
> >
> > This question definitely shouldn't take priority over the pizza. But I
> > think I saw this "rearm a timer with a user-controlled value without
> > any checks" pattern lots of times and hangs are inherently harder to
> > localize and reproduce. So I wonder if it makes sense to add a debug
> > config that would catch such cases right when the timer is set up
> > (issue a WARNING)?
>
> Yes and no. It's hard to differentiate between a valid short expiry
> rearm and something which is caused by unchecked values. I have some
> ideas but all of them are expensive and therefore probably debug
> only. Which is actually better than nothing :)
>
> > However, for automated testing there is the usual question of
> > balancing between false positives and false negatives. The check
> > should not produce false positives, but at the same time it should
> > catch [almost] all actual stalls so that they don't manifest as
> > duplicate stall reports.
>
> Right. The problem could be even there with checked values:
>
> start_timer(1ms)
> timer_expires()
> callback()
> forward_timer(timer, now, period(1ms));
>
> which might be perfectly fine with a production kernel as it leaves
> enough time to make overall progress.
>
> Now with a full debug kernel with all bells and whistels that callback
> might just run into this situation:
>
> start_timer(1ms) T0
> timer_expires() T1
> callback()
> do_stuff()
> forward_timer(timer, TNOW, period(1ms));
>
>
> T1 - T0 = 1.001ms
> TNOW - T1 = 0.998 ms
>
> So the forward will just rearm it to T0 + 2ms which means it expires in
> 1us.
>
> > If I understand it correctly the timer is not actually set up as
> > periodic, but rather each callback invocation arms it again. Setting
> > up a timer for 1 ns _once_ (or few times) is probably fine (right?),
> > so the check needs to be somewhat more elaborate and detect "infinite"
> > rearming.
>
> Yes.
>
> That made me actually look at that mac80211_hwsim callback again.
>
> hrtimer_forward(&data->beacon_timer, hrtimer_get_expires(timer),
> ns_to_ktime(bcn_int * NSEC_PER_USEC));
>
> So what this does is really wrong because it tries to schedule the timer
> on the theoretical periodic timeline. Which goes really south once the
> timer is late or the callback execution took longer than the
> period. Hypervisors scheduling out a VCPU at the wrong place will do
> that for you nicely.

Nice!

You mentioned that hrtimer_run_queues() may not return. Does it mean
that it can just loop executing the same re-armed callback again and
again? Maybe then the debug check condition should be that
hrtimer_run_queues() runs the same callback more than N times w/o
returning?


> What this actually should use is hrtimer_forward_now() which prevents
> that problem because it will forward the timer in the periodic schedule
> beyond now. That won't prevent the above corner case, but I doubt you
> can create an endless loop with that scenario as easy as you can with
> trying to catch up on your theoretical timeline by using the previous
> expiry time as a base for the forward. Patch below.
>
> /me goes off to audit hrtimer_forward() usage. Sigh...
>
> After that figure out ways to debug or even prevent this. More sigh...
>
> Thanks,
>
> tglx
> ---
> drivers/net/wireless/mac80211_hwsim.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> --- a/drivers/net/wireless/mac80211_hwsim.c
> +++ b/drivers/net/wireless/mac80211_hwsim.c
> @@ -1867,8 +1867,8 @@ mac80211_hwsim_beacon(struct hrtimer *ti
> bcn_int -= data->bcn_delta;
> data->bcn_delta = 0;
> }
> - hrtimer_forward(&data->beacon_timer, hrtimer_get_expires(timer),
> - ns_to_ktime(bcn_int * NSEC_PER_USEC));
> + hrtimer_forward_now(&data->beacon_timer,
> + ns_to_ktime(bcn_int * NSEC_PER_USEC));
> return HRTIMER_RESTART;
> }
>