Re: [PATCH] time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint

From: Paul E. McKenney
Date: Fri Jun 21 2019 - 13:50:51 EST


On Fri, Jun 21, 2019 at 10:41:04AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 21, 2019 at 06:34:14AM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 21, 2019 at 02:29:27PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jun 21, 2019 at 05:16:30AM -0700, Paul E. McKenney wrote:
> > > > A pair of full hangs at boot (TASKS03 and TREE04), no console output
> > > > whatsoever. Not sure how these changes could cause that, but suspicion
> > > > falls on sched_tick_offload_init(). Though even that is a bit strange
> > > > because if so, why didn't TREE01 and TREE07 also hang? Again, looking
> > > > into it.
> > >
> > > Pesky details ;-)
> >
> > And backing out to the earlier patch removes the hangs, though statistical
> > insignificance and all that.
>
> And purists might argue that four failures out of four attempts does not
> constitute true statistical significance, but too bad. If I interpose
> a twork pointer in sched_tick_offload_init()'s initialization, it seems
> to work fine, give or take lack of statistical significance. This is
> surprising, so I am rerunning with added parentheses in the atomic_set()
> expression.

Huh. This works, albeit only once:

int __init sched_tick_offload_init(void)
{
	struct tick_work *twork;
	int cpu;

	tick_work_cpu = alloc_percpu(struct tick_work);
	BUG_ON(!tick_work_cpu);
	for_each_possible_cpu(cpu) {
		twork = per_cpu_ptr(tick_work_cpu, cpu);
		atomic_set(&twork->state, TICK_SCHED_REMOTE_OFFLINE);
	}

	return 0;
}

This does not work:

int __init sched_tick_offload_init(void)
{
	int cpu;

	tick_work_cpu = alloc_percpu(struct tick_work);
	BUG_ON(!tick_work_cpu);
	for_each_possible_cpu(cpu)
		atomic_set(&(per_cpu(tick_work_cpu, cpu)->state), TICK_SCHED_REMOTE_OFFLINE);

	return 0;
}

I will run more tests on the one that worked only once. In the meantime,
feel free to tell me what stupid thing I did with the parentheses.

Thanx, Paul