Re: [PATCH 1/2] rcu/tree: Add a warning if CPU being onlined did not report QS already

From: Paul E. McKenney
Date: Fri Oct 02 2020 - 00:40:14 EST


On Tue, Sep 29, 2020 at 03:29:27PM -0400, Joel Fernandes (Google) wrote:
> Currently, rcu_cpu_starting() checks to see if the RCU core expects a
> quiescent state from the incoming CPU. However, the current interaction
> between RCU quiescent-state reporting and CPU-hotplug operations should
> mean that the incoming CPU never needs to report a quiescent state.
> First, the outgoing CPU reports a quiescent state if needed. Second,
> the race where the CPU is leaving just as RCU is initializing a new
> grace period is handled by an explicit check for this condition. Third,
> the CPU's leaf rcu_node structure's ->lock serializes these checks.
>
> This means that if rcu_cpu_starting() ever feels the need to report
> a quiescent state, then there is a bug somewhere in the CPU hotplug
> code or the RCU grace-period handling code. This commit therefore
> adds a WARN_ON_ONCE() to bring that bug to everyone's attention.
>
> Cc: Paul E. McKenney <paulmck@xxxxxxxxxx>
> Cc: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>
> Suggested-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>

Queued for testing and further review, thank you!

Thanx, Paul

> ---
> kernel/rcu/tree.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 55d3700dd1e7..5efe0a98ea45 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4119,7 +4119,9 @@ void rcu_cpu_starting(unsigned int cpu)
> rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
> rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
> rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
> - if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */
> +
> + /* An incoming CPU should never be blocking a grace period. */
> + if (WARN_ON_ONCE(rnp->qsmask & mask)) { /* RCU waiting on incoming CPU? */
> rcu_disable_urgency_upon_qs(rdp);
> /* Report QS -after- changing ->qsmaskinitnext! */
> rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
> --
> 2.28.0.709.gb0816b6eb0-goog
>