Re: [PATCH v2 2/2] tools/memory-model: Make ppo a subrelation of po

From: Paul E. McKenney
Date: Mon Jan 30 2023 - 11:50:46 EST


On Mon, Jan 30, 2023 at 11:47:50AM -0500, Alan Stern wrote:
> On Sun, Jan 29, 2023 at 08:36:45PM -0800, Paul E. McKenney wrote:
> > On Sun, Jan 29, 2023 at 09:39:17PM -0500, Alan Stern wrote:
> > > On Sun, Jan 29, 2023 at 11:19:32PM +0100, Jonas Oberhauser wrote:
> > > > I see now. Somehow I thought stores must execute in program order, but I
> > > > guess it doesn't make sense.
> > > > In that sense, W ->xbstar&int X always means W propagates to X's CPU before
> > > > X executes.
> > >
> > > It also means any write that propagates to W's CPU before W executes
> > > also propagates to X's CPU before X executes (because it's the same CPU
> > > and W executes before X).
> > >
> > > > > Ideally we would fix this by changing the definition of po-rel to:
> > > > >
> > > > > [M] ; (xbstar & int) ; [Release]
> > > > >
> > > > > (This is closely related to the use of (xbstar & int) in the definition
> > > > > of vis that you asked about.)
> > > >
> > > > This misses the property of release stores that any po-earlier store must
> > > > also execute before the release store.
> > >
> > > I should have written:
> > >
> > > [M] ; (po | (xbstar & int)) ; [Release]
> > >
> > > > Perhaps it could be changed to the old po-rel | [M] ; (xbstar & int) ;
> > > > [Release] but then one could instead move this into the definition of
> > > > cumul-fence.
> > > > In fact you'd probably want this for all the propagation fences, so
> > > > cumul-fence and pb should be the right place.
> > > >
> > > > > Unfortunately we can't do this, because
> > > > > po-rel has to be defined long before xbstar.
> > > >
> > > > You could do it, by turning the relation into one massive recursive
> > > > definition.
> > >
> > > Which would make pretty much the entire memory model one big recursion.
> > > I do not want to do that.
> > >
> > > > Thinking about what the options are:
> > > > 1) accept the difference and run with it by making it consistent inside the
> > > > axiomatic model
> > > > 2) fix it through the recursive definition, which seems to be quite ugly but
> > > > also consistent with the power operational model as far as I can tell
> > > > 3) weaken the operational model... somehow
> > > > 4) just ignore the anomaly
> > > > 5) ???
> > > >
> > > > Currently my least favorite option is 4) since it seems a bit off that the
> > > > reasoning applies in one specific case of LKMM, more specifically the data
> > > > race definition which should be equivalent to "the order of the two races
> > > > isn't fixed", but here the order isn't fixed but it's a data race.
> > > > I think the patch happens to almost do 1) because the xbstar&int at the end
> > > > should already imply ordering through the prop&int <= hb rule.
> > > > What would remain is to also exclude rcu-fence somehow.
> > >
> > > IMO 1) is the best choice.
> > >
> > > Alan
> > >
> > > PS: For the record, here's a simpler litmus test to illustrate the
> > > failing. The idea is that Wz=1 is reordered before the store-release,
> > > so it ought to propagate before Wy=1. The LKMM does not require this.
> >
> > In PowerPC terms, would this be like having the Wz=1 being reordered
> > before the Wy=1, but not before the lwsync instruction preceding the
> > Wy=1 that made it be a release store?
>
> No, it would be like having the Wz=1 reordered before the Rx=1,
> therefore before the lwsync. Obviously this can't ever happen on
> PowerPC.

Whew!!! ;-)

Thanx, Paul

> Alan
>
> > If so, we might have to keep this quirk.
> >
> > Thanx, Paul
> >
> > > C before-release
> > >
> > > {}
> > >
> > > P0(int *x, int *y, int *z)
> > > {
> > > 	int r1;
> > >
> > > 	r1 = READ_ONCE(*x);
> > > 	smp_store_release(y, 1);
> > > 	WRITE_ONCE(*z, 1);
> > > }
> > >
> > > P1(int *x, int *y, int *z)
> > > {
> > > 	int r2;
> > >
> > > 	r2 = READ_ONCE(*z);
> > > 	WRITE_ONCE(*x, r2);
> > > }
> > >
> > > P2(int *x, int *y, int *z)
> > > {
> > > 	int r3;
> > > 	int r4;
> > >
> > > 	r3 = READ_ONCE(*y);
> > > 	smp_rmb();
> > > 	r4 = READ_ONCE(*z);
> > > }
> > >
> > > exists (0:r1=1 /\ 2:r3=1 /\ 2:r4=0)
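
For readers who want to see why the exists clause depends on Wz=1 being
reordered before the Rx read, here is a minimal sketch (not part of the
LKMM tooling, and no substitute for running the test under herd7) that
enumerates every sequentially consistent interleaving of the three
threads above. The names state, result, step, explore and run_sc are
made up for this illustration. Under that in-order assumption r1=1 never
occurs, so the outcome can only arise if P0's write to z takes effect
before its read of x.

```c
#include <assert.h>

/* Shared variables and registers for one in-order (SC) execution. */
struct state {
	int x, y, z;
	int r1, r2, r3, r4;
	int pc[3];	/* next operation index for P0, P1, P2 */
};

/* Counters gathered over all interleavings (hypothetical helper type). */
struct result {
	int executions;		/* complete interleavings visited */
	int r1_is_1;		/* executions ending with 0:r1=1 */
	int exists_hits;	/* executions matching the exists clause */
};

/* Memory operations per thread; smp_rmb() does nothing under SC. */
static const int nops[3] = { 3, 2, 2 };

/* Execute the next operation of thread t in program order. */
static void step(struct state *s, int t)
{
	int op = s->pc[t]++;

	if (t == 0) {
		if (op == 0)
			s->r1 = s->x;	/* r1 = READ_ONCE(*x) */
		else if (op == 1)
			s->y = 1;	/* smp_store_release(y, 1) */
		else
			s->z = 1;	/* WRITE_ONCE(*z, 1) */
	} else if (t == 1) {
		if (op == 0)
			s->r2 = s->z;	/* r2 = READ_ONCE(*z) */
		else
			s->x = s->r2;	/* WRITE_ONCE(*x, r2) */
	} else {
		if (op == 0)
			s->r3 = s->y;	/* r3 = READ_ONCE(*y) */
		else
			s->r4 = s->z;	/* r4 = READ_ONCE(*z) */
	}
}

/* Depth-first walk over every interleaving of the remaining operations. */
static void explore(struct state s, struct result *res)
{
	int t, done = 1;

	for (t = 0; t < 3; t++) {
		if (s.pc[t] < nops[t]) {
			struct state next = s;

			done = 0;
			step(&next, t);
			explore(next, res);
		}
	}
	if (done) {
		res->executions++;
		if (s.r1 == 1)
			res->r1_is_1++;
		if (s.r1 == 1 && s.r3 == 1 && s.r4 == 0)
			res->exists_hits++;
	}
}

/* Start from the all-zero initial state given by the "{}" line. */
static struct result run_sc(void)
{
	struct state init = { 0 };
	struct result res = { 0, 0, 0 };

	explore(init, &res);
	return res;
}
```

Calling run_sc() visits all 210 interleavings, and both the r1_is_1 and
exists_hits counters stay at zero. This says nothing about what the LKMM
allows; it only shows that the outcome is out of reach without the
reordering discussed above.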