Re: [tip: core/rcu] softirq: Don't try waking ksoftirqd before it has been spawned

From: Uladzislau Rezki
Date: Wed Apr 14 2021 - 05:00:46 EST


On Wed, Apr 14, 2021 at 09:13:22AM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-04-12 11:36:45 [-0700], Paul E. McKenney wrote:
> > > Color me confused. I did not follow the discussion around this
> > > completely, but wasn't it agreed on that this rcu torture muck can wait
> > > until the threads are brought up?
> >
> > Yes, we can cause rcutorture to wait. But in this case, rcutorture
> > is just the messenger, and making it wait would simply be ignoring
> > the message. The message is that someone could invoke any number of
> > things that wait on a softirq handler's invocation during the interval
> > before ksoftirqd has been spawned.
>
> My memory on this is that the only user, that required this early
> behaviour, was kprobe which was recently changed to not need it anymore.
> Which makes the test as the only user that remains. Therefore I thought
> that this test will be moved to later position (when ksoftirqd is up and
> running) and that there is no more requirement for RCU to be completely
> up that early in the boot process.
>
> Did I miss anything?
>
Seems not. Let me summarize, though I may be missing something:

1) Initially we had an issue with booting RISC-V because of:

36dadef23fcc ("kprobes: Init kprobes in early_initcall")

i.e. kprobes initialization was moved to the early_initcall()
phase. Since kprobes uses synchronize_rcu_tasks(), the system did
not boot, because RCU-tasks was set up at the core_initcall() step,
which runs later in the initcall chain.

To address that issue, we decided to move RCU-tasks setup
to before early_initcall(), and it worked well:

https://lore.kernel.org/lkml/20210218083636.GA2030@xxxxxxxxx/T/
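
The ordering problem in (1) can be sketched as a small userspace
model. This is only an illustration, not kernel code: the stage names
and the helper kprobes_init_succeeds() are hypothetical, and a real
kernel hangs at boot rather than returning a failure value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical boot stages, listed in the order they run. */
enum boot_stage {
	STAGE_PRE_EARLY,       /* just before early_initcall() */
	STAGE_EARLY_INITCALL,
	STAGE_CORE_INITCALL,
	STAGE_COUNT
};

/*
 * Model: kprobes init needs RCU-tasks to be set up already.
 * Returns true if RCU-tasks is ready by the time kprobes init runs.
 */
bool kprobes_init_succeeds(enum boot_stage rcu_tasks_stage,
			   enum boot_stage kprobes_stage)
{
	bool rcu_tasks_ready = false;
	bool ok = false;

	assert(rcu_tasks_stage < STAGE_COUNT && kprobes_stage < STAGE_COUNT);

	/* Walk the stages in boot order. */
	for (int s = 0; s < STAGE_COUNT; s++) {
		if (s == (int)rcu_tasks_stage)
			rcu_tasks_ready = true;
		if (s == (int)kprobes_stage)
			ok = rcu_tasks_ready;
	}
	return ok;
}
```

With RCU-tasks at STAGE_CORE_INITCALL and kprobes at
STAGE_EARLY_INITCALL the model reports failure; moving RCU-tasks to
STAGE_PRE_EARLY makes it succeed.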

2) After that fix you reported another issue: if the kernel is booted
with "threadirqs=1", it did not boot either, because ksoftirqd does
not exist yet at that point, so our early RCU self-test did not pass.

3) Due to (2), Masami Hiramatsu proposed fixing kprobes by delaying
kprobe optimization, which also addressed the initial issue:

https://lore.kernel.org/lkml/20210219112357.GA34462@xxxxxxxxx/T/

At the same time Paul made another patch:

softirq: Don't try waking ksoftirqd before it has been spawned

It allows us to keep RCU-tasks initialization where it is now,
before even early_initcall(), and lets our RCU self-test
complete without hanging.
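
The idea suggested by that patch title can be sketched in userspace as
follows. This is a simplified model under my own assumptions, not the
actual patch: struct cpu_state, invoke_softirq_model() and the
check_spawned switch are hypothetical names used only to contrast the
two behaviours.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task {
	bool woken;
};

struct cpu_state {
	bool force_irqthreads;   /* models forced threaded interrupts */
	struct task *ksoftirqd;  /* NULL until the thread is spawned */
	unsigned int pending;    /* softirqs waiting to be handled */
	unsigned int handled;    /* softirqs handled so far */
};

/*
 * Returns true if pending softirqs were handled inline.
 * With check_spawned == false, work is always deferred to ksoftirqd
 * when threading is forced, even if the thread does not exist yet,
 * so it stays pending. With check_spawned == true, a missing thread
 * makes the model fall back to inline handling.
 */
bool invoke_softirq_model(struct cpu_state *cpu, bool check_spawned)
{
	bool thread_missing;

	assert(cpu != NULL);
	thread_missing = check_spawned && cpu->ksoftirqd == NULL;

	if (!cpu->force_irqthreads || thread_missing) {
		cpu->handled += cpu->pending;
		cpu->pending = 0;
		return true;
	}

	/* Defer to the thread; nothing happens if it is not there. */
	if (cpu->ksoftirqd != NULL)
		cpu->ksoftirqd->woken = true;
	return false;
}
```

In this model, an early caller waiting on a softirq handler makes no
progress while the thread is missing unless check_spawned is set,
which mirrors the hang described in (2).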

--
Vlad Rezki