Re: [PATCH tip/core/rcu 1/3] rcu: Provide polling interfaces for Tree RCU grace periods

From: Frederic Weisbecker
Date: Fri Mar 12 2021 - 07:22:22 EST


On Wed, Mar 03, 2021 at 04:26:30PM -0800, paulmck@xxxxxxxxxx wrote:
> From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
>
> There is a need for a non-blocking polling interface for RCU grace
> periods, so this commit supplies start_poll_synchronize_rcu() and
> poll_state_synchronize_rcu() for this purpose. Note that the existing
> get_state_synchronize_rcu() may be used if future grace periods are
> inevitable (perhaps due to a later call_rcu() invocation). The new
> start_poll_synchronize_rcu() is to be used if future grace periods
> might not otherwise happen.

By "future grace period", do you mean a grace period that was started right
_before_ we start polling?


> Finally, poll_state_synchronize_rcu()
> provides a lockless check for a grace period having elapsed since
> the corresponding call to either of the get_state_synchronize_rcu()
> or start_poll_synchronize_rcu().
>
> As with get_state_synchronize_rcu(), the return value from either
> get_state_synchronize_rcu() or start_poll_synchronize_rcu() is passed in
> to a later call to either poll_state_synchronize_rcu() or the existing
> (might_sleep) cond_synchronize_rcu().
>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
[...]
> /**
> + * start_poll_state_synchronize_rcu - Snapshot and start RCU grace period
> + *
> + * Returns a cookie that is used by a later call to cond_synchronize_rcu()
> + * or poll_state_synchronize_rcu() to determine whether or not a full
> + * grace period has elapsed in the meantime. If the needed grace period
> + * is not already slated to start, notifies RCU core of the need for that
> + * grace period.
> + *
> + * Interrupts must be enabled for the case where it is necessary to awaken
> + * the grace-period kthread.
> + */
> +unsigned long start_poll_synchronize_rcu(void)
> +{
> + unsigned long flags;
> + unsigned long gp_seq = get_state_synchronize_rcu();
> + bool needwake;
> + struct rcu_data *rdp;
> + struct rcu_node *rnp;
> +
> + lockdep_assert_irqs_enabled();
> + local_irq_save(flags);
> + rdp = this_cpu_ptr(&rcu_data);
> + rnp = rdp->mynode;
> + raw_spin_lock_rcu_node(rnp); // irqs already disabled.
> + needwake = rcu_start_this_gp(rnp, rdp, gp_seq);

I'm a bit surprised we don't start a new grace period instead of snapshotting
the current one.

So if we do this:

	// A grace period is already in progress: gp_num=5

	old = p;
	rcu_assign_pointer(p, new);  // unpublish old

	num = start_poll_synchronize_rcu();  // num = 5

	// Grace period 5 ends, grace period 6 starts

	if (poll_state_synchronize_rcu(num))  // grace period 5 is reported done
		kfree(old);

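Isn't there a risk that readers on other CPUs, which entered their read-side
critical sections after grace period 5 began but before the pointer update,
still see the old pointer?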

Of course I know I'm missing something obvious :-)
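
Maybe the piece I'm missing is the snapshot arithmetic. Below is a userspace
sketch, not kernel code, written under the assumption that
get_state_synchronize_rcu() is built on rcu_seq_snap() and that
poll_state_synchronize_rcu() is built on rcu_seq_done(), with a two-bit state
field as in kernel/rcu/rcu.h. The seq_*() helpers, ulong_cmp_ge() and
gps_needed_after_midgp_snapshot() are names made up for this model, and the
starting counter value is arbitrary.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Userspace model only. Assumed layout: low 2 bits = state, rest = counter. */
#define SEQ_CTR_SHIFT   2
#define SEQ_STATE_MASK  ((1UL << SEQ_CTR_SHIFT) - 1)

/* Wrap-safe "a >= b" on unsigned long. */
static inline bool ulong_cmp_ge(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 >= a - b;
}

/* A grace period starts: state bits become nonzero. */
static inline void seq_start(unsigned long *sp)
{
	*sp += 1;
}

/* A grace period ends: state bits clear, counter part advances. */
static inline void seq_end(unsigned long *sp)
{
	*sp = (*sp | SEQ_STATE_MASK) + 1;
}

/*
 * Cookie for "a full grace period has elapsed after now". If a grace
 * period is already in flight, the rounding skips past its end value.
 */
static inline unsigned long seq_snap(const unsigned long *sp)
{
	return (*sp + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

/* Has the counter reached the value named by the cookie? */
static inline bool seq_done(const unsigned long *sp, unsigned long cookie)
{
	return ulong_cmp_ge(*sp, cookie);
}

/*
 * Take a cookie while a grace period is in flight, then count how many
 * grace-period ends it takes before the cookie is reported as done.
 */
static inline int gps_needed_after_midgp_snapshot(void)
{
	unsigned long seq = 5UL << SEQ_CTR_SHIFT;  /* idle, arbitrary start */
	unsigned long cookie;
	int ended = 0;

	seq_start(&seq);          /* a grace period is already in progress */
	cookie = seq_snap(&seq);  /* updater snapshots after unpublishing  */

	while (!seq_done(&seq, cookie)) {
		if (seq & SEQ_STATE_MASK) {
			seq_end(&seq);    /* in-flight grace period completes */
			ended++;
		} else {
			seq_start(&seq);  /* next grace period begins */
		}
	}
	return ended;
}
```

If the real code behaves like this model, the cookie taken in the middle of
a grace period is only satisfied after that grace period and one further
full grace period have ended, so num in my example would not correspond to
grace period 5.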

Thanks.