Re: XDP socket rings, and LKMM litmus tests

From: Boqun Feng
Date: Thu Mar 04 2021 - 20:13:41 EST


On Thu, Mar 04, 2021 at 11:11:42AM -0500, Alan Stern wrote:
> On Thu, Mar 04, 2021 at 02:33:32PM +0800, Boqun Feng wrote:
>
> > Right, I was thinking about something unrelated.. but how about the
> > following case:
> >
> > 	local_v = &y;
> > 	r1 = READ_ONCE(*x); // f
> >
> > 	if (r1 == 1) {
> > 		local_v = &y; // e
> > 	} else {
> > 		local_v = &z; // d
> > 	}
> >
> > 	p = READ_ONCE(local_v); // g
> >
> > 	r2 = READ_ONCE(*p); // h
> >
> > If r1 == 1, we definitely think we have:
> >
> > 	f ->ctrl e ->rfi g ->addr h
> >
> > and if we treat ctrl;rfi as "to-r", then we have "f" happens before
> > "h". However, the compiler can optimize the above as:
> >
> > 	local_v = &y;
> >
> > 	r1 = READ_ONCE(*x); // f
> >
> > 	if (r1 != 1) {
> > 		local_v = &z; // d
> > 	}
> >
> > 	p = READ_ONCE(local_v); // g
> >
> > 	r2 = READ_ONCE(*p); // h
> >
> > and when this gets executed, I don't think we have the guarantee that
> > "f" happens before "h", because the CPU can do optimistic reads for "g"
> > and "h".
>
> In your example, which accesses are supposed to be to actual memory and
> which to registers? Also, remember that the memory model assumes the

Given that we use READ_ONCE() on local_v, local_v should be a memory
location but only accessed by this thread.

> hardware does not reorder loads if there is an address dependency
> between them.
>

Right, so "g" won't be reordered after "h".
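
For reference, the two forms of the example quoted above can be written
as plain userspace C. This is only a minimal sketch: READ_ONCE() here is
a volatile-cast stand-in (an assumption, not the kernel header), and the
values 10 and 20 for y and z are arbitrary. It only shows that the two
forms agree within a single thread, which is why a compiler may pick
either one; it says nothing about what another CPU can observe.

```c
/* Userspace stand-in for the kernel's READ_ONCE(); an assumption for this sketch. */
#define READ_ONCE(v) (*(const volatile __typeof__(v) *)&(v))

int y = 10;	/* arbitrary distinct values, so the result shows which one was read */
int z = 20;

/* Original form: both branches store to local_v. */
int read_original(int *x)
{
	int *local_v = &y;
	int r1 = READ_ONCE(*x);		/* f */
	int *p;

	if (r1 == 1)
		local_v = &y;		/* e */
	else
		local_v = &z;		/* d */

	p = READ_ONCE(local_v);		/* g */
	return READ_ONCE(*p);		/* h */
}

/* Transformed form: the redundant store "e" is gone, so when r1 == 1
 * no store to local_v depends on the value read by "f". */
int read_transformed(int *x)
{
	int *local_v = &y;
	int r1 = READ_ONCE(*x);		/* f */
	int *p;

	if (r1 != 1)
		local_v = &z;		/* d */

	p = READ_ONCE(local_v);		/* g */
	return READ_ONCE(*p);		/* h */
}
```

Calling both functions with *x set to 0, 1 or 2 returns the same value
from each, so nothing in single-threaded behaviour stops the compiler
from dropping "e".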

> > Part of this is because when we take plain accesses into consideration,
> > we can't guarantee that a read-from or other relation exists if compiler
> > optimizations happen.
> >
> > Maybe I'm missing something subtle, but I'm just trying to think through
> > the effect of treating dep; rfi as "to-r".
>
> Forget about local variables for the time being and just consider
>
> dep ; [Plain] ; rfi
>
> For example:
>
> A:	r1 = READ_ONCE(x);
> 	y = r1;
> B:	r2 = READ_ONCE(y);
>
> Should B be ordered after A? I don't see how any CPU could hope to
> execute B before A, but maybe I'm missing something.
>

Agreed.

> There's another twist, connected with the fact that herd7 can't detect
> control dependencies caused by unexecuted code. If we have:
>
> A:	r1 = READ_ONCE(x);
> 	if (r1)
> 		WRITE_ONCE(y, 5);
> 	r2 = READ_ONCE(y);
> B:	WRITE_ONCE(z, r2);
>
> then in executions where x == 0, herd7 doesn't see any control
> dependency. But CPUs do see control dependencies whenever there is a
> conditional branch, whether the branch is taken or not, and so they will
> never reorder B before A.
>

Right, because B in this example is a write. What if B is a read that
depends on r2, as in my example? Let y be a pointer initialized to a
valid value (pointing to a valid memory location); your example then
becomes:

A:	r1 = READ_ONCE(x);
	if (r1)
		WRITE_ONCE(y, 5);
C:	r2 = READ_ONCE(y);
B:	r3 = READ_ONCE(*r2);

Then A doesn't have a control dependency to B, because A and B are
read+read. So B can be ordered before A, right?
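
To make the shape of the modified example above concrete, here is a
minimal userspace sketch. Since y is a pointer, the sketch assumes the
conditional store writes a valid address rather than 5; the targets "a"
and "b" and the macro definitions are hypothetical stand-ins, not kernel
code. It only runs the A, C, B chain in one thread and cannot show any
reordering.

```c
/* Userspace stand-ins for the kernel macros; assumptions for this sketch. */
#define READ_ONCE(v)       (*(const volatile __typeof__(v) *)&(v))
#define WRITE_ONCE(v, val) (*(volatile __typeof__(v) *)&(v) = (val))

int a = 1;	/* hypothetical initial target of y */
int b = 2;	/* hypothetical second target, standing in for the "5" */
int *y = &a;	/* y starts out pointing to a valid location */

int read_through_y(int *x)
{
	int r1 = READ_ONCE(*x);		/* A */
	int *r2;

	if (r1)
		WRITE_ONCE(y, &b);	/* store that is control-dependent on A */
	r2 = READ_ONCE(y);		/* C: may read from the store above (rfi) */
	return READ_ONCE(*r2);		/* B: address-dependent on C */
}
```

With *x == 0 the function returns 1 (the value of a). Once it has been
called with *x != 0, y points to b and later calls return 2.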

> One last thing to think about: My original assessment of Björn's problem
> wasn't right, because the dep in (dep ; rfi) doesn't include control
> dependencies. Only data and address. So I believe that the LKMM

Ah, right. I was missing that part (ctrl is not in dep). So I guess my
example is pointless for the question we are discussing here ;-(

> wouldn't consider A to be ordered before B in this example even if x
> was nonzero.

Yes, and the same goes for my example (with B changed to a read).

I did try to run my example with herd, and got confused because the
result was the same whether or not I made dep; [Plain]; rfi a to-r
(either way it told me a reorder can happen). Now the reason is clear:
this is a ctrl; rfi, not a dep; rfi.

Thanks so much for walking me through this ;-)

Regards,
Boqun

>
> Alan