Re: [RFC] rcu: Warn that rcu ktheads cannot be spawned

From: Byungchul Park
Date: Mon Jun 24 2019 - 22:41:51 EST


On Mon, Jun 24, 2019 at 10:25:51AM -0700, Paul E. McKenney wrote:
> On Mon, Jun 24, 2019 at 12:46:24PM -0400, Joel Fernandes wrote:
> > On Mon, Jun 24, 2019 at 05:27:32PM +0900, Byungchul Park wrote:
> > > Hello rcu folks,
> > >
> > > I thought it'd better to announce it if those spawnings fail because of
> > > !rcu_scheduler_fully_active.
> > >
> > > Of course, with the current code, it never happens though.
> > >
> > > Thoughts?
> >
> > It seems in the right spirit, but with your patch a warning always fires.
> > rcu_prepare_cpu() is called multiple times, once from rcu_init() and then
> > from hotplug paths.
> >
> > Warning splat stack looks like:
> >
> > [ 0.398767] Call Trace:
> > [ 0.398775] rcu_init+0x6aa/0x724
> > [ 0.398779] start_kernel+0x220/0x4a2
> > [ 0.398780] ? copy_bootdata+0x12/0xac
> > [ 0.398782] secondary_startup_64+0xa4/0xb0
>
> Thank you both, and I will remove this from my testing queue.
>
> As Joel says, this is called at various points in the boot sequence, not
> all of which are far enough along to support spawning kthreads.
>
> The real question here is "What types of bugs are we trying to defend
> against?" But keeping in mind existing diagnostics. For example, are
> there any kthreads for which a persistent failure to spawn would not
> emit any error message. My belief is that any such persistent failure
> would result in either an in-kernel diagnostic or an rcutorture failure,
> but I might well be missing something.
>
> Thoughts? Or, more to the point, tests demonstrating silence in face
> of such a persistent failure?

You are right. There wouldn't be a persistent failure, because the
CPU-online path always tries to spawn them, *even* if the boot sequence
went wrong. And the current code gets the sequence right anyway.

I thought there could be a hole if the code ever changed so that those
kthreads could not be spawned until the CPU comes up, which is the case
I was interested in. Again, that will never happen with the current code,
because it spawns them after setting rcu_scheduler_fully_active to 1 in
rcu_spawn_gp_kthread().
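
To make the ordering concrete, here is a rough userspace model of it. It
is only a sketch, not the kernel code: every model_* name and
MODEL_NR_CPUS are made up for illustration, and only the flag name
rcu_scheduler_fully_active is taken from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for this model */

static bool rcu_scheduler_fully_active;			/* models the kernel flag */
static bool model_kthread_spawned[MODEL_NR_CPUS];	/* per-CPU kthread state */

/* Models a per-CPU spawn attempt: a silent no-op until the flag is set. */
static bool model_spawn_cpu_kthread(int cpu)
{
	if (!rcu_scheduler_fully_active)
		return false;	/* too early in boot, nothing is spawned */
	model_kthread_spawned[cpu] = true;
	return true;
}

/* Models the CPU-online path: it always attempts the spawn. */
static bool model_prepare_cpu(int cpu)
{
	return model_spawn_cpu_kthread(cpu);
}

/* Models the init step: set the flag first, then spawn. */
static void model_spawn_gp_kthread(void)
{
	assert(!rcu_scheduler_fully_active);	/* expected to run only once */
	rcu_scheduler_fully_active = true;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_spawn_cpu_kthread(cpu);
}
```

In this model an early model_prepare_cpu() call fails silently, but
because the flag is set before the spawn loop runs, no CPU is left
without its kthread afterwards, and a later online attempt succeeds too.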

I also wrongly thought you had placed the rcu_scheduler_fully_active
check on the spawning path just in case. But that seems not to be the
case.

So I'd better stop working on the warning patch. :) Instead, please
check the following trivial fix.

Thanks,
Byungchul

---8<---