Re: [PATCH v2] rcu: Fix the start_poll_synchronize_rcu_expedited() be invoked very early

From: Paul E. McKenney
Date: Thu Jan 12 2023 - 14:19:17 EST


On Thu, Jan 12, 2023 at 03:56:29PM +0800, Zqiang wrote:
> Currently, the start_poll_synchronize_rcu_expedited() can be invoked
> very early. before rcu_init(), the rcu_data structure's->mynode is not
> initialized, if invoke start_poll_synchronize_rcu_expedited() before
> rcu_init(), will trigger a null rcu_node structure's->exp_seq_poll access.
>
> This commit add boot_exp_seq_poll_rq member to rcu_state structure to
> store seq number return by invoke start_poll_synchronize_rcu_expedited()
> very early.
>
> Fixes: d96c52fe4907 ("rcu: Add polled expedited grace-period primitives")
> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>

First off, excellent catch, Zqiang!!!

And thank you for Frederic and Joel for your reviews.

But I believe that this can be simplified, for example, as shown in
the (untested) patch below.

Thoughts?

And yes, I did presumptuously add Frederic's and Joel's reviews.
Please let me know if you disagree, and if so what different approach
you would prefer. (Though of course simple disagreement is sufficient
for me to remove your tag. Not holding you hostage for improvements,
not yet, anyway!)

Thanx, Paul

------------------------------------------------------------------------

commit e05af5cb3858e669c9e6b70e0aca708cc70457da
Author: Zqiang <qiang1.zhang@xxxxxxxxx>
Date: Thu Jan 12 10:48:29 2023 -0800

rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early

According to the commit log of the patch that added it to the kernel,
start_poll_synchronize_rcu_expedited() can be invoked very early, as
in long before rcu_init() has been invoked. But before rcu_init(),
the rcu_data structure's ->mynode field has not yet been initialized.
This means that the start_poll_synchronize_rcu_expedited() function's
attempt to set the CPU's leaf rcu_node structure's ->exp_seq_poll_rq
field will result in a segmentation fault.

This commit therefore causes start_poll_synchronize_rcu_expedited() to
set ->exp_seq_poll_rq only after rcu_init() has initialized all CPUs'
rcu_data structures' ->mynode fields. It also removes the check from
the rcu_init() function so that start_poll_synchronize_rcu_expedited(
is unconditionally invoked. Yes, this might result in an unnecessary
boot-time grace period, but this is down in the noise. Besides, there
only has to be one call_rcu() invoked prior to scheduler initialization
to make this boot-time grace period necessary.

Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
Reviewed-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
Reviewed-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 63545d79da51c..f2e3a23778c06 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4937,9 +4937,8 @@ void __init rcu_init(void)
else
qovld_calc = qovld;

- // Kick-start any polled grace periods that started early.
- if (!(per_cpu_ptr(&rcu_data, cpu)->mynode->exp_seq_poll_rq & 0x1))
- (void)start_poll_synchronize_rcu_expedited();
+ // Kick-start in case any polled grace periods started early.
+ (void)start_poll_synchronize_rcu_expedited();

rcu_test_sync_prims();
}
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 956cd459ba7f3..3b7abb58157df 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -1068,9 +1068,10 @@ unsigned long start_poll_synchronize_rcu_expedited(void)
if (rcu_init_invoked())
raw_spin_lock_irqsave(&rnp->exp_poll_lock, flags);
if (!poll_state_synchronize_rcu(s)) {
- rnp->exp_seq_poll_rq = s;
- if (rcu_init_invoked())
+ if (rcu_init_invoked()) {
+ rnp->exp_seq_poll_rq = s;
queue_work(rcu_gp_wq, &rnp->exp_poll_wq);
+ }
}
if (rcu_init_invoked())
raw_spin_unlock_irqrestore(&rnp->exp_poll_lock, flags);