Re: [PATCH tip/core/rcu 3/7] rcu: Avoid IPIing idle CPUs from synchronize_sched_expedited()

From: Paul E. McKenney
Date: Wed Oct 29 2014 - 11:56:52 EST


On Wed, Oct 29, 2014 at 11:59:54AM +0100, Peter Zijlstra wrote:
> On Tue, Oct 28, 2014 at 03:22:58PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> >
> > Currently, synchronize_sched_expedited() sends IPIs to all online CPUs,
> > even those that are idle or executing in nohz_full= userspace. Because
> > idle CPUs and nohz_full= userspace CPUs are in extended quiescent states,
> > there is no need to IPI them in the first place. This commit therefore
> > avoids IPIing CPUs that are already in extended quiescent states.
> >
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > ---
> > kernel/rcu/tree.c | 27 ++++++++++++++++++++++++++-
> > 1 file changed, 26 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 7f73c5edf8cf..9e3c20f117cd 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2950,6 +2950,9 @@ static int synchronize_sched_expedited_cpu_stop(void *data)
> >   */
> >  void synchronize_sched_expedited(void)
> >  {
> > +	cpumask_var_t cm;
> > +	bool cma = false;
> > +	int cpu;
> >  	long firstsnap, s, snap;
> >  	int trycount = 0;
> >  	struct rcu_state *rsp = &rcu_sched_state;
> > @@ -2984,11 +2987,26 @@ void synchronize_sched_expedited(void)
> >  	}
> >  	WARN_ON_ONCE(cpu_is_offline(raw_smp_processor_id()));
> >
> > +	/* Offline CPUs, idle CPUs, and any CPU we run on are quiescent. */
> > +	cma = zalloc_cpumask_var(&cm, GFP_KERNEL);
> > +	if (cma) {
> > +		cpumask_copy(cm, cpu_online_mask);
> > +		cpumask_clear_cpu(raw_smp_processor_id(), cm);
> > +		for_each_cpu(cpu, cm) {
> > +			struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
> > +
> > +			if (!(atomic_add_return(0, &rdtp->dynticks) & 0x1))
> > +				cpumask_clear_cpu(cpu, cm);
> > +		}
> > +		if (cpumask_weight(cm) == 0)
> > +			goto all_cpus_idle;
> > +	}
>
> Is there a reason not to use on_each_cpu_cond()?

Because I don't know how to write a function that returns a blooean value?
(Sorry, couldn't resist, and yes I do know that "boolean" was meant.)
If we had lambdas, I might be interested in making that transformation,
but pulling the condition into a separate function doesn't seem like
a win to me.
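
To make the comparison concrete, here is a minimal user-space sketch of
the filtering step with the condition pulled out into a separate
predicate, in the (cpu, info) shape that on_each_cpu_cond() takes for
its condition callback. It is only a model, not kernel code: struct
cpu_model, NR_CPUS_MODEL, cpu_needs_attention() and build_target_mask()
are hypothetical names, a plain unsigned long stands in for the cpumask,
and an ordinary integer stands in for the atomic ->dynticks counter.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 8	/* hypothetical CPU count for this model */

/* Stand-in for per-CPU state; an even dynticks value models a CPU that
 * is in an extended quiescent state (idle or nohz_full userspace). */
struct cpu_model {
	bool online;
	unsigned int dynticks;
};

/* Predicate in the (cpu, info) callback shape: true if this CPU is
 * online and not in an extended quiescent state. */
static bool cpu_needs_attention(int cpu, void *info)
{
	const struct cpu_model *cpus = info;

	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	return cpus[cpu].online && (cpus[cpu].dynticks & 0x1);
}

/* Build the set of CPUs that must be disturbed, skipping the CPU we
 * are running on.  Returns the number of bits set in *mask; zero
 * corresponds to the patch's "all_cpus_idle" shortcut. */
static int build_target_mask(struct cpu_model *cpus, int self,
			     unsigned long *mask)
{
	int cpu, weight = 0;

	*mask = 0;
	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (cpu == self)
			continue;
		if (cpu_needs_attention(cpu, cpus)) {
			*mask |= 1UL << cpu;
			weight++;
		}
	}
	return weight;
}
```

In this model the predicate is a single expression, so splitting it out
adds a function and a void pointer cast without removing any of the
loop around it.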

But even with lambdas, it looks to me like on_each_cpu_cond() just does
an IPI, and I need the selected CPUs to do a context switch. Yes, I
could make the IPI handler function call induce a context switch, but
then I would have to add more mechanism to wait for the induced context
switches to actually happen.
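
As a rough illustration of what that extra mechanism might look like,
the user-space sketch below snapshots a per-CPU context-switch count for
the targeted CPUs and later checks whether every one of them has
advanced. Everything here is an assumption made for illustration: the
counter array, struct switch_snapshot, snapshot_switches() and
all_targets_switched() are hypothetical and do not correspond to any
existing kernel interface, and a real version would also need memory
ordering and a way to sleep or poll.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 8	/* hypothetical CPU count for this model */

/* Context-switch counts recorded when the IPIs were sent. */
struct switch_snapshot {
	unsigned long mask;			/* CPUs that were IPIed */
	unsigned long count[NR_CPUS_MODEL];	/* their counts at that time */
};

/* Record the current per-CPU counts along with the set of targets. */
static void snapshot_switches(struct switch_snapshot *snap,
			      const unsigned long *counts,
			      unsigned long mask)
{
	int cpu;

	assert(snap && counts);
	snap->mask = mask;
	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		snap->count[cpu] = counts[cpu];
}

/* True once every targeted CPU has a count different from its snapshot,
 * that is, once each has context-switched at least once since then. */
static bool all_targets_switched(const struct switch_snapshot *snap,
				 const unsigned long *counts)
{
	int cpu;

	assert(snap && counts);
	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (!((snap->mask >> cpu) & 1UL))
			continue;	/* not a target, ignore */
		if (counts[cpu] == snap->count[cpu])
			return false;	/* no switch seen yet */
	}
	return true;
}
```

The caller would have to keep re-checking all_targets_switched() until
it returns true, which is the waiting step that a bare IPI does not
provide.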

That said, I am considering switching synchronize_sched_expedited()
from try_stop_cpus() to resched_cpu() if I need to parallelize
synchronize_sched_expedited().

Thanx, Paul
