Re: [PATCH] time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint

From: Frederic Weisbecker
Date: Mon Jun 24 2019 - 20:43:27 EST


On Mon, Jun 24, 2019 at 04:44:22PM -0700, Paul E. McKenney wrote:
> On Tue, Jun 25, 2019 at 01:12:23AM +0200, Frederic Weisbecker wrote:
> > On Fri, Jun 21, 2019 at 04:46:02PM -0700, Paul E. McKenney wrote:
> > > @@ -3097,13 +3126,21 @@ static void sched_tick_remote(struct work_struct *work)
> > > /*
> > > * Run the remote tick once per second (1Hz). This arbitrary
> > > * frequency is large enough to avoid overload but short enough
> > > - * to keep scheduler internal stats reasonably up to date.
> > > + * to keep scheduler internal stats reasonably up to date. But
> > > + * first update state to reflect hotplug activity if required.
> > > */
> > > + os = atomic_read(&twork->state);
> > > + if (os) {
> > > + WARN_ON_ONCE(os != TICK_SCHED_REMOTE_OFFLINING);
> > > + if (atomic_inc_not_zero(&twork->state))
> > > + return;
> >
> > Using inc makes me a bit nervous here. If we do so, we should somehow
> > make sure that we never exceed TICK_SCHED_REMOTE_OFFLINE
> > by accident.
> >
> > atomic_xchg() is probably a bit costlier but also safer, as it allows
> > us to check both the old and the new value. That path isn't performance-critical
> > after all.
>
> It would need to be cmpxchg() to avoid messing with the state if
> the state were somehow TICK_SCHED_REMOTE_RUNNING, right?

Ah, indeed! Never mind, let's keep things as they are then.
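
For the record, here is a minimal userspace sketch of the difference, using C11 atomics as a stand-in for the kernel's atomic_t helpers. The numeric state values are an assumption inferred from the quoted hunk (RUNNING being the zero state), and the two helper names are hypothetical, not part of the patch:

```c
#include <stdatomic.h>

/* Assumed encoding, inferred from the quoted hunk; not confirmed by it. */
enum {
	TICK_SCHED_REMOTE_RUNNING   = 0,
	TICK_SCHED_REMOTE_OFFLINING = 1,
	TICK_SCHED_REMOTE_OFFLINE   = 2,
};

/*
 * Hypothetical helper: unconditional exchange. Returns the old state
 * and always stores OFFLINE, even if the old state was RUNNING.
 */
static int offline_with_xchg(atomic_int *state)
{
	return atomic_exchange(state, TICK_SCHED_REMOTE_OFFLINE);
}

/*
 * Hypothetical helper: compare-and-exchange. Stores OFFLINE only if
 * the state is currently OFFLINING. Returns the value that was observed.
 */
static int offline_with_cmpxchg(atomic_int *state)
{
	int expected = TICK_SCHED_REMOTE_OFFLINING;

	/* On failure, 'expected' is overwritten with the current value. */
	atomic_compare_exchange_strong(state, &expected,
				       TICK_SCHED_REMOTE_OFFLINE);
	return expected;
}
```

With the state at RUNNING, the xchg variant overwrites it with OFFLINE, whereas the cmpxchg variant leaves it untouched and just reports what it saw.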

> > > + }
> > > queue_delayed_work(system_unbound_wq, dwork, HZ);
> > > }
> > >
> > > static void sched_tick_start(int cpu)
> > > {
> > > + int os;
> > > struct tick_work *twork;
> > >
> > > if (housekeeping_cpu(cpu, HK_FLAG_TICK))
> > > @@ -3112,15 +3149,20 @@ static void sched_tick_start(int cpu)
> > > WARN_ON_ONCE(!tick_work_cpu);
> > >
> > > twork = per_cpu_ptr(tick_work_cpu, cpu);
> > > - twork->cpu = cpu;
> > > - INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
> > > - queue_delayed_work(system_unbound_wq, &twork->work, HZ);
> > > + os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
> > > + WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
> >
> > See, if we use atomic_inc(), we would also need to WARN(os > TICK_SCHED_REMOTE_OFFLINE).
>
> How about if I put that WARN() between the atomic_inc_not_zero() and
> the return, presumably also adding braces?

Yeah, unfortunately there is no atomic_add_not_zero_return().
I guess we can live with a check using atomic_read(). In the best
case it returns the freshly incremented value; otherwise it should be
TICK_SCHED_REMOTE_RUNNING.

In any case the (os > TICK_SCHED_REMOTE_OFFLINE) check applies.
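
Roughly what I have in mind, as a userspace sketch with C11 atomics rather than kernel code. The state values are assumed from the quoted hunk (RUNNING being zero), and inc_not_zero() and remote_tick_should_stop() are hypothetical stand-ins, the former only mimicking the semantics of atomic_inc_not_zero():

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed encoding, inferred from the quoted hunk; not confirmed by it. */
enum {
	TICK_SCHED_REMOTE_RUNNING   = 0,
	TICK_SCHED_REMOTE_OFFLINING = 1,
	TICK_SCHED_REMOTE_OFFLINE   = 2,
};

/* Stand-in for atomic_inc_not_zero(): increment unless the value is 0. */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Returns true if the remote tick should stop instead of being requeued.
 * *overshoot is set when the value read back after the increment is
 * above OFFLINE, which is the case that would deserve a WARN().
 */
static bool remote_tick_should_stop(atomic_int *state, bool *overshoot)
{
	*overshoot = false;
	if (atomic_load(state) == TICK_SCHED_REMOTE_RUNNING)
		return false;
	if (inc_not_zero(state)) {
		/* Separate read: it may also observe a later concurrent update. */
		*overshoot = atomic_load(state) > TICK_SCHED_REMOTE_OFFLINE;
		return true;
	}
	return false;
}
```

The read-back is not atomic with the increment, so it is only a sanity check, but it would still catch the state running past OFFLINE.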

Thanks.