Re: [PATCH locking/Documentation 1/2] Add note of release-acquire store vulnerability

From: Paul E. McKenney
Date: Fri Sep 30 2016 - 09:35:38 EST


On Fri, Sep 30, 2016 at 02:51:13PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 30, 2016 at 05:14:03AM -0700, Paul E. McKenney wrote:
> > PowerPC does not "obscure" stores, so both stores really are there and
> > the lwsync really has effect on all CPUs. From what I understand, even
> > CPUs that do obscure stores only do so in the case of repeated stores
> > by the same CPU to the same variable, and the above litmus test doesn't
> > have this.
> >
> > So all the stores happen, and each CPU's stores are at least locally
> > ordered.
>
> OK, I'm not sure I ever understood the case where smp_wmb() went
> wonky on PPC, sadly I cannot now find the email where you mentioned
> that :/

First, a better explanation of your example:

PPC PeterZijlstra+o-r+o-r+a-o-SB.litmus
{
0:r1=1; 0:r2=2; 0:r3=x; 0:r4=y;
1:r1=1; 1:r2=2; 1:r3=x; 1:r4=y;
2:r3=x; 2:r4=y;
}
P0           | P1           | P2           ;
stw r1,0(r3) | stw r2,0(r3) | lwz r1,0(r4) ;
lwsync       | lwsync       | lwsync       ;
stw r1,0(r4) | stw r2,0(r4) | lwz r2,0(r3) ;
exists
(x=2 /\ y=1 /\ 2:r1=1 /\ 2:r2=1)

Given that 2:r1=1, and ignoring P1 for the moment, we have simple message
passing. If P2 sees P0's store to y, it must also see P0's store to x.

So what happens when we include P1? Well, we have constrained the test
to the case where P2 sees P0's store to y, so P2's load from x must
still see P0's store to x, or some later store to x. Either way, given
that P2 sees P0's store to y, it cannot see the initial value of x.
In other words, even if P0's store to x is overwritten, it still has
effect on the ordering.
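
For what it is worth, here is a rough sanity check of that reasoning as
a small Python sketch. It assumes sequential consistency, which is
stronger than anything PowerPC provides, so it says nothing about what
lwsync itself permits; it only enumerates plain interleavings of the
three threads above. The THREADS table and the helper names are made up
for illustration.

```python
from itertools import permutations

# Program-order operations of each thread in the litmus test.
# The lwsync barriers are omitted: under the (assumed) sequentially
# consistent model, every interleaving already respects program order.
THREADS = {
    "P0": [("st", "x", 1), ("st", "y", 1)],
    "P1": [("st", "x", 2), ("st", "y", 2)],
    "P2": [("ld", "y", "r1"), ("ld", "x", "r2")],
}

def interleavings():
    """Return every distinct schedule, one thread name per executed op."""
    slots = [name for name, ops in THREADS.items() for _ in ops]
    return set(permutations(slots))

def run(schedule):
    """Execute one schedule and return (final x, final y, 2:r1, 2:r2)."""
    mem = {"x": 0, "y": 0}
    regs = {}
    pc = {name: 0 for name in THREADS}
    for name in schedule:
        op, var, arg = THREADS[name][pc[name]]
        pc[name] += 1
        if op == "st":
            mem[var] = arg
        else:
            regs[arg] = mem[var]
    return (mem["x"], mem["y"], regs["r1"], regs["r2"])

outcomes = {run(s) for s in interleavings()}

# Outcomes in which P2 saw P0's store to y (2:r1=1).
seen_p0_y = {o for o in outcomes if o[2] == 1}

# In each of them, P2's load from x returns P0's store or a later one,
# never the initial value.
print(sorted(seen_p0_y))
```

In every outcome with 2:r1=1, the value of 2:r2 is either 1 or 2,
matching the message-passing argument.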

There are several changes to the litmus test that could require ordering
that lwsync does not provide, which I suppose could be considered to
introduce wonkiness. ;-)

One such change gives the infamous "Z6.3" litmus test that you called
out in your earlier email. There, at least one of the pairs of stores
must be separated by sync rather than lwsync in order to forbid the
cycle. Z6.3's third variable defeats lwsync's local ordering.

Thanx, Paul