Re: "Verifying and Optimizing Compact NUMA-Aware Locks on Weak Memory Models"

From: Alan Stern
Date: Sat Sep 10 2022 - 11:03:39 EST


On Sat, Sep 10, 2022 at 12:11:36PM +0000, Hernan Luis Ponce de Leon wrote:
>
> What they mean seems to be that a prop relation followed only by wmb
> (not mb) doesn't enforce the order of some writes to the same
> location, leading to the claimed hang in qspinlock (at least as far as
> LKMM is concerned).

You were quoting Jonas here, right? The email doesn't make this obvious
because it doesn't have two levels of "> > " markings.

> What we mean is that wmb does not give the same propagation properties as mb.

In general, _no_ two distinct relations in the LKMM have the same
propagation properties. If wmb always behaved the same way as mb, we
wouldn't use two separate words for them.

> The claim is based on these relations from the memory model
>
> let strong-fence = mb | gp
> ...
> let cumul-fence = [Marked] ; (A-cumul(strong-fence | po-rel) | wmb |
> po-unlock-lock-po) ; [Marked]
> let prop = [Marked] ; (overwrite & ext)? ; cumul-fence* ;
> [Marked] ; rfe? ; [Marked]

Please be more specific. What difference between mb and wmb are you
concerned about? Can you give a small litmus test that illustrates this
difference? Can you explain in more detail how this difference affects
the qspinlock implementation?

> From an engineering perspective, I think the only issue is that cat
> *currently* does not have any syntax for this,

Syntax for what? The difference between wmb and mb?

> nor does herd currently
> implement the await model checking techniques proposed in those works
> (cf. Theorem 5.3 in the "making weak memory models fair" paper,
> which says that for this kind of loop, iff the mo-maximal reads in
> some graph are read in a loop iteration that does not exit the loop,
> the loop can run forever). However, GenMC and, I believe, Dat3M and
> more recently Nidhugg support such techniques. It may not even be too
> much effort to implement something like this in herd if desired.

I believe that herd has no way to express the idea of a program running
forever. On the other hand, it's certainly true (in all of these
models) that for any finite number N, there is a feasible execution in
which a loop runs for more than N iterations before the termination
condition eventually becomes true.
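
To make the distinction concrete, here is a minimal sketch (a toy
model, not anything herd does). It models a reader spinning on a flag
and treats the point at which the writer's store becomes visible as a
free parameter. The names spin_until_set, store_visible_after and
execution_exceeding are made up for this illustration. For every N
there is a terminating execution with more than N iterations, yet no
single execution in this model runs forever.

```python
def spin_until_set(store_visible_after):
    """Model a reader spinning on a flag that a writer eventually sets.

    store_visible_after (hypothetical parameter): how many loop
    iterations still read the old value 0 before the writer's store
    becomes visible to the reader.

    Returns the total number of iterations executed, including the
    final one that observes the flag and exits the loop.
    """
    iterations = 0
    while True:
        iterations += 1
        # The read sees 0 until the store has propagated, then 1.
        flag = 0 if iterations <= store_visible_after else 1
        if flag:
            return iterations


def execution_exceeding(n):
    """Pick a finite execution whose loop runs for more than n iterations."""
    return spin_until_set(store_visible_after=n + 1)


if __name__ == "__main__":
    for n in (0, 10, 1000):
        print(n, execution_exceeding(n))
```

Every call returns, so this only shows unbounded finite executions. An
execution that never terminates needs a separate liveness argument.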

Alan

> The Dartagnan model checker uses Theorem 5.3 from above to detect
> liveness violations.
>
> We did not try to come up with a litmus test about the behavior
> because herd7 cannot reason about liveness.
> However, if anybody is interested, the violating execution is shown here
> https://github.com/huawei-drc/cna-verification/blob/master/verification-output/BUG1.png
>
> Hernan
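
For readers unfamiliar with the criterion quoted above (Theorem 5.3 as
paraphrased in this thread), the check can be sketched roughly as
follows. This is an illustration under stated assumptions, not
Dartagnan's or GenMC's actual implementation. The function name, the
dictionary encoding of the modification order (mo), and the write
identifiers are all hypothetical.

```python
def reads_are_mo_maximal(mo, failed_iteration_reads):
    """Toy version of the liveness criterion paraphrased in the thread.

    mo: dict mapping each location to its list of write ids in
        modification order, oldest first (assumed encoding).
    failed_iteration_reads: list of (location, write_id) pairs for the
        reads performed in a loop iteration that did not exit the loop.

    Returns True if every such read takes its value from the mo-last
    write to its location. Under the paraphrased theorem, that means
    the iteration can repeat forever.
    """
    return all(mo[loc][-1] == write
               for loc, write in failed_iteration_reads)


if __name__ == "__main__":
    # Hypothetical execution: one location with two writes.
    mo = {"lock": ["init", "w1"]}
    # The spinning read already sees the last write: potential hang.
    print(reads_are_mo_maximal(mo, [("lock", "w1")]))
    # The spinning read sees a stale write: a newer value can still arrive.
    print(reads_are_mo_maximal(mo, [("lock", "init")]))
```

A real checker would also have to establish that the execution graph
is consistent with the memory model. This sketch takes that as given.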