Re: [PATCH 01/13] rcu/nocb: Fix potential missed nocb_timer rearm

From: Frederic Weisbecker
Date: Wed Mar 03 2021 - 06:32:07 EST


On Tue, Mar 02, 2021 at 06:06:43PM -0800, Paul E. McKenney wrote:
> On Wed, Mar 03, 2021 at 02:35:33AM +0100, Frederic Weisbecker wrote:
> > On Tue, Mar 02, 2021 at 10:17:29AM -0800, Paul E. McKenney wrote:
> > > On Tue, Mar 02, 2021 at 01:34:44PM +0100, Frederic Weisbecker wrote:
> > >
> > > OK, how about if I queue a temporary commit (shown below) that just
> > > calls out the first scenario so that I can start testing, and you get
> > > me more detail on the second scenario? I can then update the commit.
> >
> > Sure, meanwhile here is an attempt at a nocb_bypass_timer based
> > scenario. It's overly hairy, and perhaps I'm picturing more power
> > in the hands of callbacks advancing on nocb_cb_wait() than it
> > really has:
>
> Thank you very much!
>
> I must defer looking through this in detail until I am more awake,
> but I do very much like the fine-grained exposition.
>
> Thanx, Paul
>
> > 0. CPU 0's ->nocb_cb_kthread just called rcu_do_batch() and
> > executed all the ready callbacks. Its segcblist is now
> > entirely empty. It's preempted while calling local_bh_enable().
> >
> > 1. A new callback is enqueued on CPU 0 with IRQs enabled, so
> > the ->nocb_gp_kthread covering CPUs 0-2 is awakened. Then a
> > storm of callback enqueues follows on CPU 0 and even reaches
> > the bypass queue. Note that ->nocb_gp_kthread is also
> > associated with CPU 0.
> >
> > 2. CPU 0 queues one last bypass callback.
> >
> > 3. The ->nocb_gp_kthread wakes up and associates a grace period
> > with the whole queue of regular callbacks on CPU 0. It also
> > tries to flush the bypass queue of CPU 0, but the bypass lock
> > is contended due to the concurrent enqueuing in step 2, so
> > the flush fails.
> >
> > 4. This ->nocb_gp_kthread arms its ->nocb_bypass_timer and goes
> > to sleep waiting for the end of this future grace period.
> >
> > 5. This grace period elapses before the ->nocb_bypass_timer
> > fires. This is normally improbable given that the timer is set
> > for only two jiffies, but timers can be delayed. Besides, it
> > is possible that the kernel was built with CONFIG_RCU_STRICT_GRACE_PERIOD=y.
> >
> > 6. The grace period ends, so rcu_gp_kthread awakens the
> > ->nocb_gp_kthread, but the latter doesn't get a chance to run
> > on a CPU for a while.
> >
> > 7. CPU 0's ->nocb_cb_kthread gets back to the CPU after its preemption.
> > As it notices the newly completed grace period, it advances the callbacks
> > and executes them. Then it gets preempted again in local_bh_enable().
> >
> > 8. A new callback enqueued on CPU 0 itself flushes the bypass queue
> > because CPU 0's ->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy.
> >
> > 9. CPUs from other ->nocb_gp_kthread groups (above CPU 2) initiate
> > a few grace periods, which elapse. CPU 0's ->nocb_gp_kthread still
> > hasn't had an opportunity to run on a CPU and its ->nocb_bypass_timer
> > still hasn't fired.
> >
> > 10. CPU 0's ->nocb_cb_kthread wakes up from preemption. It notices the
> > new grace periods that have elapsed, advances all the callbacks and
> > executes them. Then it goes to sleep waiting for invocable
> > callbacks.

I'm just not so sure about point 10 above. Even though a few grace periods
have elapsed, the callback queued in step 8 is still in RCU_NEXT_TAIL at
this point. Perhaps one more grace period is necessary after that.

Anyway, I need to be more awake as well before checking that again.