RE: [PATCH] rcu: Add rnp->cbovldmask check in rcutree_migrate_callbacks()

From: Zhang, Qiang1
Date: Thu May 05 2022 - 20:41:03 EST


On Thu, May 05, 2022 at 11:52:36PM +0800, Zqiang wrote:
> Currently, the rnp's cbovlmask is set in call_rcu(). when CPU going
> offline, the outgoing CPU's callbacks is migrated to target CPU, the
> number of callbacks on the my_rdp may be overloaded, if overload and
> there is no call_rcu() call on target CPU for a long time, the rnp's
> cbovldmask is not set in time. in order to fix this situation, add
> check_cb_ovld_locked() in rcutree_migrate_callbacks() to help CPU more
> quickly reach quiescent states.
>
> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>

>Doesn't this get set right at the end of the current grace period?
>Given that there is a callback overload, there should be a grace period in progress.
>
>See this code in rcu_gp_cleanup():
>
> if (rcu_is_leaf_node(rnp))
> for_each_leaf_node_cpu_mask(rnp, cpu, rnp->cbovldmask) {
> rdp = per_cpu_ptr(&rcu_data, cpu);
> check_cb_ovld_locked(rdp, rnp);
> }
>
>So what am I missing here? Or are you planning to remove the above code?

We only checked the overloaded rdp at the end of current grace period, for
my_rdp overloaded cause by migration callbacks to it, if the my_rdp overloaded,
and the my_rdp->mynode 's cbovldmask is empty, the my_rdp overloaded may be
not checked at end of the current grace period.

I have another question, why don't we call check_cb_ovld_locked() when rdp's n_cbs decreases.
for example call check_cb_ovld_locked() in rcu_do_bacth(), not at the end of grace period.

>
>If so, wouldn't you also need to clear the indication for the CPU that is going offline, being careful to handle the case where the two CPUs have different leaf rcu_node structures?

Yes the offline CPU need to clear.

Thanks,
Zqiang

>
> Thanx, Paul
>
> ---
> kernel/rcu/tree.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index
> 9dc4c4e82db6..bcc5876c9753 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4577,6 +4577,7 @@ void rcutree_migrate_callbacks(int cpu)
> needwake = needwake || rcu_advance_cbs(my_rnp, my_rdp);
> rcu_segcblist_disable(&rdp->cblist);
> WARN_ON_ONCE(rcu_segcblist_empty(&my_rdp->cblist) !=
> !rcu_segcblist_n_cbs(&my_rdp->cblist));
> + check_cb_ovld_locked(my_rdp, my_rnp);
> if (rcu_rdp_is_offloaded(my_rdp)) {
> raw_spin_unlock_rcu_node(my_rnp); /* irqs remain disabled. */
> __call_rcu_nocb_wake(my_rdp, true, flags);
> --
> 2.25.1
>