Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

From: Paul E. McKenney
Date: Tue Jan 24 2023 - 17:15:34 EST


On Tue, Jan 24, 2023 at 08:30:08PM +0100, Jonas Oberhauser wrote:
> On 1/24/2023 6:26 PM, Paul E. McKenney wrote:
> > On Tue, Jan 24, 2023 at 05:39:53PM +0100, Jonas Oberhauser wrote:
> > >
> > > On 1/24/2023 5:22 PM, Paul E. McKenney wrote:
> > > > I clearly recall some
> > > > store-based lack of ordering after a grace period from some years back,
> > > > and am thus far failing to reproduce it.
> > > >
> > > > And here is another attempt that herd7 actually does allow.
> > > >
> > > > So what did I mess up this time? ;-)
> > > >
> > > > Thanx, Paul
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > C C-srcu-observed-4
> > > >
> > > > (*
> > > >  * Result: Sometimes
> > > >  *
> > > >  * The Linux-kernel implementation is suspected to forbid this.
> > > >  *)
> > > >
> > > > {}
> > > >
> > > > P0(int *x, int *y, int *z, struct srcu_struct *s)
> > > > {
> > > > 	int r1;
> > > >
> > > > 	r1 = srcu_read_lock(s);
> > > > 	WRITE_ONCE(*y, 2);
> > > > 	WRITE_ONCE(*x, 1);
> > > > 	srcu_read_unlock(s, r1);
> > > > }
> > > >
> > > > P1(int *x, int *y, int *z, struct srcu_struct *s)
> > > > {
> > > > 	int r1;
> > > >
> > > > 	WRITE_ONCE(*y, 1);
> > > > 	synchronize_srcu(s);
> > > > 	WRITE_ONCE(*z, 2);
> > > > }
> > > >
> > > > P2(int *x, int *y, int *z, struct srcu_struct *s)
> > > > {
> > > > 	WRITE_ONCE(*z, 1);
> > > > 	smp_store_release(x, 2);
> > > > }
> > > >
> > > > exists (x=1 /\ y=1 /\ z=1)
> > > I think even if you implement the unlock as mb() followed by some store that
> > > is read by the gp between mb()s, this would still be allowed.
> > The implementation of synchronize_srcu() has quite a few smp_mb()
> > invocations.
> >
> > But exactly how are you modeling this? As in what additional accesses
> > and memory barriers are you placing in which locations?
>
> Along these lines:
>
> P0(int *x, int *y, int *z, int *magic_location)
> {
> 	int r1;
>
> 	WRITE_ONCE(*y, 2);
> 	WRITE_ONCE(*x, 1);
>
> 	smp_mb();
> 	WRITE_ONCE(*magic_location, 1);
> }
>
> P1(int *x, int *y, int *z, int *magic_location)
> {
> 	int r1;
>
> 	WRITE_ONCE(*y, 1);
>
> 	smp_mb();
> 	while (!READ_ONCE(*magic_location))
> 		;
> 	smp_mb();
> 	WRITE_ONCE(*z, 2);
> }
>
> P2(int *x, int *y, int *z, struct srcu_struct *s)
> {
> 	WRITE_ONCE(*z, 1);
> 	smp_store_release(x, 2);
> }
>
>
>
> Note that you can add as many additional smp_mb() and other accesses as you
> want around the original srcu call sites. I don't see how they could
> influence the absence of a cycle.
>
> (Also, to make it work with herd it seems you need to replace the loop with
> a single read and state in the exists clause that it happens to read a 1.)

I agree that LKMM would allow such a litmus test.
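
For reference, here is a sketch of that herd7-friendly variant, with the
busy-wait loop replaced by a single READ_ONCE() whose value is pinned to 1
in the exists clause. The test name is made up, and this is only a
transcription of the model above, with P2's unused srcu_struct parameter
replaced by magic_location so that the test is self-contained:

C C-srcu-observed-4-mb

(*
 * Sketch of the smp_mb()-based model of the SRCU calls, made
 * herd7-friendly by replacing P1's polling loop with a single read
 * that the exists clause requires to return 1.
 *)

{}

P0(int *x, int *y, int *z, int *magic_location)
{
	WRITE_ONCE(*y, 2);
	WRITE_ONCE(*x, 1);
	smp_mb();
	WRITE_ONCE(*magic_location, 1);
}

P1(int *x, int *y, int *z, int *magic_location)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*magic_location);
	smp_mb();
	WRITE_ONCE(*z, 2);
}

P2(int *x, int *y, int *z, int *magic_location)
{
	WRITE_ONCE(*z, 1);
	smp_store_release(x, 2);
}

exists (x=1 /\ y=1 /\ z=1 /\ 1:r1=1)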

> > > I have already forgotten the specifics, but I think the Power model allows
> > > certain stores to never propagate somewhere?
> > PowerPC would forbid the 3.2W case, where each process used an
> > smp_store_release() as its sole ordering (no smp_mb() calls at all).
> >
> > [...]
> >
> > This propagation is modulated by the memory barriers, though.
>
> Ah, looking at the model now. Indeed it's forbidden: in order to say that
> something is in co, there must not be a (resulting) cycle of co and
> barriers, and you would get exactly such a cycle here. In the axiomatic
> model, this corresponds to requiring that Power's "prop | co" be acyclic.
> The same isn't true in LKMM, so that's probably why.

Which means that the RCU and SRCU implementations need to make (admittedly
small) guarantees that cannot be expressed in LKMM. Which is in fact
what I was remembering, so I feel better now.

Not sure about the rest of you, though. ;-)
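
For concreteness, the 3.2W shape mentioned earlier, with smp_store_release()
as each process's sole ordering, can be written as something like the sketch
below. The test name is made up; per the discussion above, PowerPC forbids
this pattern while LKMM does not:

C C-3.2W-release

(*
 * Three-process write-only cycle (3.2W) in which each process orders
 * its two stores using only smp_store_release().  The exists clause
 * holds when each process's release store is coherence-before the
 * other process's plain store to the same variable.
 *)

{}

P0(int *x, int *y)
{
	WRITE_ONCE(*x, 1);
	smp_store_release(y, 2);
}

P1(int *y, int *z)
{
	WRITE_ONCE(*y, 1);
	smp_store_release(z, 2);
}

P2(int *z, int *x)
{
	WRITE_ONCE(*z, 1);
	smp_store_release(x, 2);
}

exists (x=1 /\ y=1 /\ z=1)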

Thanx, Paul