Re: [PATCH 2/2] rcu-tasks: add RCU-tasks self tests

From: Masami Hiramatsu
Date: Wed Feb 17 2021 - 09:48:51 EST


On Tue, 16 Feb 2021 09:30:03 -0800
"Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:

> On Mon, Feb 15, 2021 at 12:28:26PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2021-02-13 08:45:54 [-0800], Paul E. McKenney wrote:
> > > Glad you like it! But let's see which (if any) of these patches solves
> > > the problem for Sebastian.
> >
> > Looking at that, is there any reason for doing this that can not be
> > solved by moving the self-test a little later? Maybe once we reached at
> > least SYSTEM_SCHEDULING?
>
> One problem is that ksoftirqd and the kprobes use are early_initcall(),
> so we cannot count on ksoftirqd being spawned when kprobes first uses
> synchronize_rcu_tasks(). Moving the selftest later won't fix this
> problem, but rather just paper it over.
>
> > This happens now even before lockdep is up or the console is registered.
> > So if something bad happens, you end up with a blank terminal.
>
> I was getting a splat, but I could easily believe that there are
> configurations where the hang is totally silent. In other words, I do
> agree that this needs a proper fix. All we need do is work out an
> agreeable value of "proper". ;-)
>
> > There is nothing else that early in the boot process that requires
> > working softirq. The only exception to this is wait_task_inactive()
> > which is used while starting a new thread (including the ksoftirqd)
> > which is why it was moved to schedule_hrtimeout().
>
> Moving kprobes initialization to early_initcall() [1] means that there
> can be a call to synchronize_rcu_tasks() before the current spawning of
> ksoftirqd. Because synchronize_rcu_tasks() needs timers to work, it needs
> softirq to work. I know two straightforward ways to make that happen:
>
> 1. Spawn ksoftirqd earlier.
>
> 2. Suppress attempts to awaken ksoftirqd before it exists,
> forcing all ksoftirq execution on the back of interrupts.
>
> Uladzislau and I each produced patches for #1, and I produced a patch
> for #2.
>
> The only other option I know of is to push the call to init_kprobes()
> later in the boot sequence, perhaps to its original subsys_initcall(),
> or maybe only as late as core_initcall(). I added Masami and Steve on
> CC for their thoughts on this.
>
> Is there some other proper fix that I am missing?

Oh, I missed that synchronize_rcu_tasks() would be involved when kprobes is
used at an early stage. Does the problem exist only with
synchronize_rcu_tasks(), and not with synchronize_rcu()? If so, I can just
stop the optimizer at the early stage, because I only want to enable kprobes
that early, not optprobes.

Does the following patch help?