Re: [PATCH] a local-timer-free version of RCU

From: Paul E. McKenney
Date: Sun Nov 07 2010 - 21:54:26 EST


On Mon, Nov 08, 2010 at 03:19:36AM +0100, Udo A. Steinberg wrote:
> On Mon, 8 Nov 2010 03:11:36 +0100 Udo A. Steinberg (UAS) wrote:
>
> UAS> On Sat, 6 Nov 2010 12:28:12 -0700 Paul E. McKenney (PEM) wrote:
> UAS>
> UAS> PEM> > + * rcu_quiescent() is called from rcu_read_unlock() when a
> UAS> PEM> > + * RCU batch was started while the rcu_read_lock/rcu_read_unlock
> UAS> PEM> > + * critical section was executing.
> UAS> PEM> > + */
> UAS> PEM> > +
> UAS> PEM> > +void rcu_quiescent(int cpu)
> UAS> PEM> > +{
> UAS> PEM>
> UAS> PEM> What prevents two different CPUs from calling this concurrently?
> UAS> PEM> Ah, apparently nothing -- the idea being that
> UAS> PEM> rcu_grace_period_complete() sorts it out. Though if the second
> UAS> PEM> CPU was delayed, it seems like it might incorrectly end a
> UAS> PEM> subsequent grace period as follows:
> UAS> PEM>
> UAS> PEM> o CPU 0 clears the second-to-last bit.
> UAS> PEM>
> UAS> PEM> o CPU 1 clears the last bit.
> UAS> PEM>
> UAS> PEM> o CPU 1 sees that the mask is empty, so invokes
> UAS> PEM> rcu_grace_period_complete(), but is delayed in the function
> UAS> PEM> preamble.
> UAS> PEM>
> UAS> PEM> o CPU 0 sees that the mask is empty, so invokes
> UAS> PEM> rcu_grace_period_complete(), ending the grace period.
> UAS> PEM> Because the RCU_NEXT_PENDING is set, it also starts
> UAS> PEM> a new grace period.
> UAS> PEM>
> UAS> PEM> o CPU 1 continues in rcu_grace_period_complete(),
> UAS> PEM> incorrectly ending the new grace period.
> UAS> PEM>
> UAS> PEM> Or am I missing something here?
> UAS>
> UAS> The scenario you describe seems possible. However, it should be easily
> UAS> fixed by passing the perceived batch number as another parameter to
> UAS> rcu_set_state() and making it part of the cmpxchg. So if the caller
> UAS> tries to set state bits on a stale batch number (e.g., batch !=
> UAS> rcu_batch), it can be detected.
> UAS>
> UAS> There is a similar, although harmless, issue in call_rcu(): Two CPUs can
> UAS> concurrently add callbacks to their respective nxt list and compute the
> UAS> same value for nxtbatch. One CPU succeeds in setting the PENDING bit
> UAS> while observing COMPLETE to be clear, so it starts a new batch.
>
> Correction: while observing COMPLETE to be set!
>
> UAS> Afterwards, the other CPU also sets the PENDING bit, but this time for
> UAS> the next batch. So it ends up requesting nxtbatch+1, although there is
> UAS> no need to. This also would be fixed by making the batch number part of
> UAS> the cmpxchg.

Another approach is to map the underlying algorithm onto the TREE_RCU
data structures. And make preempt_disable(), local_irq_save(), and
friends invoke rcu_read_lock() -- irq and nmi handlers already have
the dyntick calls into RCU, so should be easy to handle as well.
Famous last words. ;-)

Thanx, Paul

> Cheers,
>
> - Udo


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/