Re: [PATCH] QRCU with lockless fastpath

From: Oleg Nesterov
Date: Sun Feb 25 2007 - 15:17:06 EST


On 02/24, Paul E. McKenney wrote:
>
> This is an updated version of Oleg Nesterov's QRCU that avoids the
> earlier lock acquisition on the synchronize_qrcu() fastpath. This passes
> rcutorture on x86 and the weakly ordered POWER. A promela model of the
> code passes as noted before for 2 readers and 3 updaters and for 3 readers
> and 2 updaters. 3 readers and 3 updaters runs every machine that I have
> access to out of memory -- nothing like a little combinatorial explosion!
> However, after some thought, the proof ended up being simple enough:
>
> 1. If synchronize_qrcu() exits too soon, then by definition
> there has been a reader present during synchronize_srcu()'s
> full execution.
>
> 2. The counter corresponding to this reader will be at least
> 1 at all times.
>
> 3. The synchronize_qrcu() code forces at least one of the counters
> to be at least one at all times -- if there is a reader, the
> sum will be at least two. (Unfortunately, we cannot fetch
> the pair of counters atomically.)
>
> 4. Therefore, the only way that synchronize_qrcu()s fastpath can
> see a sum of 1 is if it races with another synchronize_qrcu() --
> the first synchronize_qrcu() must read one of the counters before
> the second synchronize_qrcu() increments it, and must read the
> other counter after the second synchronize_qrcu() decrements it.
> There can be at most one reader present through this entire
> operation -- otherwise, the first synchronize_qrcu() will see
> a sum of 2 or greater.
>
> 5. But the second synchronize_qrcu() will not release the mutex
> until after the reader is done. During this time, the first
> synchronize_qrcu() will always see a sum of at least 2, and
> therefore cannot take the remainder of the fastpath until the
> reader is done.
>
> 6. Because the second synchronize_qrcu() holds the mutex, no other
> synchronize_qrcu() can manipulate the counters until the reader
> is done. A repeat of the race called out in #4 above therefore
> cannot happen until after the reader is done, in which case it
> is safe for the first synchronize_qrcu() to proceed.
>
> Therefore, two summations of the counter separated by a memory barrier
> suffices and the implementation shown below also suffices.
>
> (And, yes, the fastpath -could- check for a sum of zero and exit
> immediately, but this would help only in case of a three-way race
> between two synchronize_qrcu()s and a qrcu_read_unlock(), would add
> another compare, so is not worth it.)
>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>

Thanks! This fastpath really improves QRCU.

Oleg.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/