Re: [PATCH tip/core/rcu 6/7] rcu: Drive quiescent-state-forcing delay from HZ

From: Paul E. McKenney
Date: Tue May 28 2013 - 21:29:39 EST


On Tue, May 28, 2013 at 12:07:42PM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > On Wed, May 15, 2013 at 11:20:55AM +0200, Ingo Molnar wrote:
> > >
> > > * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > > rcu: Fix comparison sense in rcu_needs_cpu()
> > > >
> > > > Commit c0f4dfd4f (rcu: Make RCU_FAST_NO_HZ take advantage of numbered
> > > > callbacks) introduced a bug that can result in excessively long grace
> > > > periods. This bug reverses the sense of the "if" statement checking
> > > > for lazy callbacks, so that RCU takes a lazy approach when there are
> > > > in fact non-lazy callbacks. This can result in excessive boot, suspend,
> > > > and resume times.
> > > >
> > > > This commit therefore fixes the sense of this "if" statement.
> > > >
> > > > Reported-by: Borislav Petkov <bp@xxxxxxxxx>
> > > > Reported-by: Bjørn Mork <bjorn@xxxxxxx>
> > > > Reported-by: Joerg Roedel <joro@xxxxxxxxxx>
> > > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > > >
> > > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > > > index 170814d..6d939a6 100644
> > > > --- a/kernel/rcutree_plugin.h
> > > > +++ b/kernel/rcutree_plugin.h
> > > > @@ -1667,7 +1667,7 @@ int rcu_needs_cpu(int cpu, unsigned long *dj)
> > > >  	rdtp->last_accelerate = jiffies;
> > > >
> > > >  	/* Request timer delay depending on laziness, and round. */
> > > > -	if (rdtp->all_lazy) {
> > > > +	if (!rdtp->all_lazy) {
> > > >  		*dj = round_up(rcu_idle_gp_delay + jiffies,
> > > >  			       rcu_idle_gp_delay) - jiffies;
> > >
> > > Neat - could this explain sporadic long (but not infinite) boot times with
> > > NOHZ_FULL?
> > >
> > > We changed HZ to be at least 1 Hz pretty recently, which might have worked
> > > around this bug.
> >
> > Quite possibly...
> >
> > Of course, I don't see the boot slowdowns in my testing. :-/
>
> They were pretty sporadic and only popped up (and down) during randconfig
> testing. Simple unrelated changes to the .config made them go away -
> heisenbugs.

I can believe that... The system has to be very quiet for this bug to
significantly slow down boot. Interrupts scattered across CPUs (for
example) would tend to force RCU's state machine forward.

Thanx, Paul

> Thanks,
>
> Ingo
>
