Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

From: Peter Zijlstra
Date: Fri Oct 09 2015 - 04:31:51 EST


On Thu, Oct 08, 2015 at 02:44:39PM -0700, Paul E. McKenney wrote:
> On Thu, Oct 08, 2015 at 01:16:38PM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote:
> > > On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote:
> >
> > > > Currently, we do need smp_mb__after_unlock_lock() to be after the
> > > > acquisition on PPC -- putting it between the unlock and the lock
> > > > of course doesn't cut it for the cross-thread unlock/lock case.
> >
> > This ^, that makes me think I don't understand
> > smp_mb__after_unlock_lock.
> >
> > How is:
> >
> > UNLOCK x
> > smp_mb__after_unlock_lock()
> > LOCK y
> >
> > a problem? That's still a full barrier.
>
> The problem is that I need smp_mb__after_unlock_lock() to give me
> transitivity even if the UNLOCK happened on one CPU and the LOCK
> on another. For that to work, the smp_mb__after_unlock_lock() needs
> to be either immediately after the acquire (the current choice) or
> immediately before the release (which would also work from a purely
> technical viewpoint, but I much prefer the current choice).
>
> Or am I missing your point?

So lots of little confusions added up to complete fail :-{

Mostly I think it was these two: UNLOCK x + LOCK x are fully ordered
(where I forgot: but not against uninvolved CPUs), and RELEASE/ACQUIRE
are transitive (where I forgot: RELEASE/ACQUIRE _chains_ are transitive,
but again not against uninvolved CPUs).

Which leads me to think I would like to suggest alternative rules for
RELEASE/ACQUIRE (to replace those Will suggested; as I think those are
partly responsible for my confusion).

 - RELEASE -> ACQUIRE is fully ordered (but not a full barrier) when
   they operate on the same variable and the ACQUIRE reads from the
   RELEASE. Notably, RELEASE/ACQUIRE are RCpc and lack transitivity.

 - RELEASE -> ACQUIRE can be upgraded to a full barrier (including
   transitivity) using smp_mb__release_acquire(), either before RELEASE
   or after ACQUIRE (but consistently [*]).

 - RELEASE -> ACQUIRE _chains_ (on shared variables) preserve causality
   (because each link is fully ordered), but are not transitive.
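
To make the first rule concrete, here is a minimal userspace sketch, not
kernel code. It assumes C11 atomics as a stand-in for
smp_store_release()/smp_load_acquire(); the names payload, flag and
run_message_passing() are made up for illustration. The point is only
that once the ACQUIRE reads from the RELEASE on the same variable, the
reader must also see the store that preceded the RELEASE.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data published by the writer */
static atomic_int flag;    /* the one variable both sides operate on */

static void *writer(void *arg)
{
	(void)arg;
	payload = 42;
	/* RELEASE: the payload store above is ordered before this store */
	atomic_store_explicit(&flag, 1, memory_order_release);
	return NULL;
}

static void *reader(void *arg)
{
	int *seen = arg;

	/* ACQUIRE: spin until we read the value stored by the RELEASE */
	while (!atomic_load_explicit(&flag, memory_order_acquire))
		;
	*seen = payload;
	return NULL;
}

/* Hypothetical driver: runs one writer and one reader, returns what the reader saw. */
int run_message_passing(void)
{
	pthread_t w, r;
	int seen = -1;
	int rc;

	payload = 0;
	atomic_store(&flag, 0);

	rc = pthread_create(&r, NULL, reader, &seen);
	assert(rc == 0);
	rc = pthread_create(&w, NULL, writer, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return seen;
}
```

The reader can only leave its loop after reading the released value, so
run_message_passing() returns 42. Nothing here says anything about what
a third thread would observe.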

And I think that in the past few weeks we've been using 'transitive'
ambiguously. The definition we have in Documentation/memory-barriers.txt
is a _strong_ transitivity, where we can make guarantees about CPUs not
directly involved.

What we have here (due to RCpc) is a weak form of transitivity, which,
while it preserves the natural concept of causality, does not extend to
other CPUs.
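
The weak form can be sketched as a two-link chain. Again this is a
userspace illustration under assumptions, with C11 release/acquire
standing in for the kernel primitives; cpu0/cpu1/cpu2, spawn() and
run_chain() are hypothetical names. Each link is a RELEASE -> ACQUIRE
pair on its own variable, and causality carries through to the last
CPU in the chain.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int x;              /* plain data written at the head of the chain */
static atomic_int a, b;    /* one shared variable per link */

static void *cpu0(void *arg)
{
	(void)arg;
	x = 1;
	atomic_store_explicit(&a, 1, memory_order_release);      /* RELEASE a */
	return NULL;
}

static void *cpu1(void *arg)
{
	(void)arg;
	while (!atomic_load_explicit(&a, memory_order_acquire))  /* ACQUIRE a */
		;
	atomic_store_explicit(&b, 1, memory_order_release);      /* RELEASE b */
	return NULL;
}

static void *cpu2(void *arg)
{
	int *seen = arg;

	while (!atomic_load_explicit(&b, memory_order_acquire))  /* ACQUIRE b */
		;
	*seen = x;   /* ordered after cpu0's store through both links */
	return NULL;
}

/* Hypothetical helper: create a thread and insist that it worked. */
static void spawn(pthread_t *t, void *(*fn)(void *), void *arg)
{
	int rc = pthread_create(t, NULL, fn, arg);

	assert(rc == 0);
	(void)rc;
}

/* Hypothetical driver: returns the value of x seen at the end of the chain. */
int run_chain(void)
{
	pthread_t t[3];
	int seen = -1;

	x = 0;
	atomic_store(&a, 0);
	atomic_store(&b, 0);

	spawn(&t[2], cpu2, &seen);
	spawn(&t[1], cpu1, NULL);
	spawn(&t[0], cpu0, NULL);

	for (int i = 0; i < 3; i++)
		pthread_join(t[i], NULL);
	return seen;
}
```

Here cpu2 is part of the chain, so run_chain() returns 1. What the
sketch does not give is any promise to a thread outside the chain. In
C11 terms, for instance, release/acquire alone does not forbid the IRIW
outcome where two observers disagree about the order of two independent
stores; that is the kind of guarantee the 'strong' form would add.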

So we could go around and call them 'strong' and 'weak' transitivity,
but I suspect it's easier for everyone involved if we come up with
separate terms (less room for error if we accidentally omit the
'strong/weak' qualifier).


[*] Do we want to take that choice away and go for:
smp_mb__after_release_acquire() ?