Re: [PATCH] a local-timer-free version of RCU

From: Paul E. McKenney
Date: Mon Nov 15 2010 - 20:28:56 EST


On Sat, Nov 13, 2010 at 11:30:49PM +0100, Frederic Weisbecker wrote:
> On Wed, Nov 10, 2010 at 08:19:20PM -0800, Paul E. McKenney wrote:
> > On Wed, Nov 10, 2010 at 06:31:11PM +0100, Peter Zijlstra wrote:
> > > On Wed, 2010-11-10 at 16:54 +0100, Frederic Weisbecker wrote:
> > > > run the sched tick and if there was nothing to do
> > > > for some time and we are in userspace, deactivate it.
> > >
> > > Not for some time, immediately, have the tick track if it was useful, if
> > > it was not, have it stop itself, like:
> > >
> > > tick()
> > > {
> > > 	int stop = 1;
> > >
> > > 	if (nr_running > 1)
> > > 		stop = 0;
> > >
> > > 	if (rcu_needs_cpu())
> > > 		stop = 0;
> > >
> > > 	...
> > >
> > > 	if (stop)
> > > 		enter_nohz_mode();
> > > }
> >
> > I am still holding out for a dyntick-hpc version of RCU that does not
> > need the tick. ;-)
>
>
> So you don't think it would be an appropriate solution? Keeping the tick for short
> periods of time while we need it only, that looks quite a good way to try.

My concern is not the tick -- it is really easy to work around lack of a
tick from an RCU viewpoint. In fact, this happens automatically given the
current implementations! If there is a callback anywhere in the system,
then RCU will prevent the corresponding CPU from entering dyntick-idle
mode, and that CPU's clock will drive the rest of RCU as needed via
force_quiescent_state(). The force_quiescent_state() workings would
want to be slightly different for dyntick-hpc, but not significantly so
(especially once I get TREE_RCU moved to kthreads).
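
To make the "happens automatically" point concrete, here is a toy
user-space model in C. It is not kernel code: the toy_ names, the
per-CPU structure, and the four-CPU array are all made up for
illustration. The only thing taken from the paragraph above is the
rule that a CPU with a queued callback is not allowed to stop its tick.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4	/* arbitrary size for the illustration */

/* Toy per-CPU state; hypothetical, not the kernel's rcu_data. */
struct toy_cpu {
	int nr_callbacks;	/* RCU callbacks queued on this CPU */
	bool tick_running;	/* is the scheduling-clock tick active? */
};

static struct toy_cpu toy_cpus[TOY_NR_CPUS];

/* Stand-in for the "does RCU still need this CPU?" check. */
static bool toy_rcu_needs_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return toy_cpus[cpu].nr_callbacks > 0;
}

/*
 * Try to stop the tick on this CPU.  The attempt is refused while
 * callbacks are queued, so that CPU's tick keeps running.
 * Returns true if the tick was stopped.
 */
static bool toy_try_enter_dyntick_idle(int cpu)
{
	if (toy_rcu_needs_cpu(cpu)) {
		toy_cpus[cpu].tick_running = true;
		return false;
	}
	toy_cpus[cpu].tick_running = false;
	return true;
}

/* Count the CPUs whose tick is still available to drive RCU. */
static int toy_nr_ticking_cpus(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_cpus[cpu].tick_running)
			n++;
	return n;
}
```

With one callback queued on one CPU, that CPU alone keeps ticking, and
once its callback count drops to zero it can stop its tick as well.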

My concern is rather all the implicit RCU-sched read-side critical
sections, particularly those that arch-specific code is creating.
And it recently occurred to me that there are necessarily more implicit
irq/preempt disables than there are exception entries.

So would you be OK with telling RCU about kernel entries/exits, but
simply not enabling the tick? The irq and NMI kernel entries/exits are
already covered, of course.

This seems to me to work out as follows:

1.	If there are no RCU callbacks anywhere in the system, RCU
	is quiescent and does not cause any IPIs or interrupts of
	any kind.  For HPC workloads, this should be the common case.

2.	If there is an RCU callback, then one CPU keeps a tick going
	and drives RCU core processing on all CPUs.  (This probably
	works with RCU as is, but somewhat painfully.)  This results
	in some IPIs, but only to those CPUs that remain running in
	the kernel for extended time periods.  Appropriate adjustment
	of RCU_JIFFIES_TILL_FORCE_QS, possibly promoted to be a
	kernel configuration parameter, should make such IPIs
	-extremely- rare.  After all, how many kernel code paths
	are going to consume (say) 10 jiffies of CPU time?  (Keep
	in mind that if the system call blocks, the CPU will enter
	dyntick-idle mode, and RCU will still recognize it as an
	innocent bystander without needing to IPI it.)

3.	The implicit RCU-sched read-side critical sections just work
	as they do today.
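
The IPI decision in cases 1 and 2 can be sketched as the toy C model
below. This is only an illustration under stated assumptions, not how
force_quiescent_state() is actually written: every sketch_ name is
hypothetical, the entry/exit hooks stand in for "telling RCU about
kernel entries/exits", and the threshold simply reuses the "(say)
10 jiffies" figure.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed threshold, echoing the "(say) 10 jiffies" figure. */
#define SKETCH_JIFFIES_TILL_FORCE_QS 10UL

enum sketch_mode { SKETCH_IDLE, SKETCH_USER, SKETCH_KERNEL };

/* Hypothetical per-CPU record maintained by the entry/exit hooks. */
struct sketch_cpu {
	enum sketch_mode mode;
	unsigned long entry_jiffies;	/* time of last kernel entry */
};

/* Hook: the CPU entered the kernel (for example, via a system call). */
static void sketch_kernel_enter(struct sketch_cpu *c, unsigned long now)
{
	c->mode = SKETCH_KERNEL;
	c->entry_jiffies = now;
}

/* Hook: the CPU returned to user space. */
static void sketch_kernel_exit(struct sketch_cpu *c)
{
	c->mode = SKETCH_USER;
}

/* Hook: the system call blocked and the CPU went dyntick-idle. */
static void sketch_block(struct sketch_cpu *c)
{
	c->mode = SKETCH_IDLE;
}

/*
 * One scan by the CPU that kept its tick.  Sets needs_ipi[cpu] for
 * each CPU that should be interrupted and returns how many there are.
 */
static int sketch_force_qs_scan(const struct sketch_cpu *cpus, int nr_cpus,
				bool callbacks_pending, unsigned long now,
				bool *needs_ipi)
{
	int cpu, nr_ipis = 0;

	assert(cpus && needs_ipi && nr_cpus >= 0);
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		needs_ipi[cpu] = false;
		if (!callbacks_pending)
			continue;	/* case 1: RCU is quiescent */
		if (cpus[cpu].mode != SKETCH_KERNEL)
			continue;	/* innocent bystander, no IPI */
		/* Unsigned subtraction stays correct across counter wrap. */
		if (now - cpus[cpu].entry_jiffies >=
		    SKETCH_JIFFIES_TILL_FORCE_QS) {
			needs_ipi[cpu] = true;
			nr_ipis++;
		}
	}
	return nr_ipis;
}
```

In this model a CPU sitting in user space or in dyntick-idle mode is
never interrupted, and a CPU in the kernel is interrupted only after it
has stayed there for the full threshold while a callback is pending.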

Or am I missing some other problems with this approach?

Thanx, Paul