Re: [PATCH] tools/memory-model: Make ppo a subrelation of po

From: Jonas Oberhauser
Date: Fri Jan 20 2023 - 19:41:59 EST




On 1/20/2023 5:32 PM, Alan Stern wrote:
On Fri, Jan 20, 2023 at 12:12:50PM +0100, Jonas Oberhauser wrote:

On 1/19/2023 9:06 PM, Alan Stern wrote:
That's backward logic. Being strong isn't the reason the fences are
A-cumulative. Indeed, the model could have been written not to assume
that strong fences are A-cumulative.
It's not just the model, it's also how strong fences are introduced in the
documentation.
The documentation can be updated.

Sure. But what would be the benefit?
Anyways, what I meant is something else.
There are (at least) two interpretations of what a strong-fence is.
The first that you have in mind as far as I understand, that every store affected by the strong-fence must propagate to every other CPU before any instruction behind the strong-fence executes. (but it doesn't talk about which stores are being affected).
The second one is that a strong-fence is an A-cumulative fence which additionally has that guarantee.

What I meant is that reading the documentation or the model, one might come to the conclusion that it means the second thing, and in that interpretation fences that are strong must be A-cumulative.
I don't see anything wrong with that, especially since I don't think strong-fence is a standard term that exists in a vacuum and means the first thing by convention.

Obviously there's some elegance in making things orthogonal rather than hierarchical.
So I can understand why you defend the first interpretation.
But there's really only a benefit if that opens up some interesting design space. I just don't see that right now.

So if hypothetically you were ok to change the meaning of strong fence to include A-cumulativity -- and I think from the model and documentation perspective it doesn't take much to do that if anything -- then saying "all strong-fence properties are covered exactly in pb" isn't a big step.



I agree though that you could decouple the notion of strong from
A-cumulativity.
But would anyone want a strong fence that is not A-cumulative?
Why would anyone want a weak fence that isn't A-cumulative? :-)
Architecture designers sometimes do strange things...

(as a side note and out of curiosity: Which architecture has a weak fence that isn't A-cumulative? Is it alpha?)

As for strong fences that aren't A-cumulative, I remember someone telling me not too long ago that one shouldn't worry about strange hypothetical architectures ; )
More to the point, I find it improbable that the performance benefit of this vs just using however smp_mb() maps on that architecture would ever warrant the inclusion of such a fence in the LKMM.

It's a bit like asking in C11 for a barrier that has the sequential
consistency guarantee of appearing in the global order S, but doesn't have
release or acquire semantics. Yes you could define that, but would it really
make sense?
You're still missing the point. The important aspect of the fences in
cumul-fence is that they are A-cumulative, not whether they are strong.

I completely understand that. I just don't think there's anything fundamentally wrong with alternatively creating a more disjoint hierarchy of
- fences that aren't A-cumulative but order propagation (in cumul-fence, without A-cumul)
- fences that are A-cumulative but aren't strong (in cumul-fence, with A-cumul)
- fences that are strong (in pb)


Right now, both pb and cumul-fence deal with strong fences. And again, I understand that one point of view here is that this is not because strong fences need to inherently be A-cumulative and included in cumul-fence by using the strong-fence identifier.
Indeed one could have, in theory, strong fences that aren't A-cumulative, and using strong-fence is as you wrote just a convenient way to list all the A-cumulative strong fences because that currently happens to be all of the strong fences.

Another POV is that one might instead formally describe things in terms of these three more disjoint classes, adapt the documentation of cumul-fence to say that it does not deal with strong fences because those are dealt with in a later relation, and not worry about hypothetical barriers that currently don't have a justifying use case.
(And I suppose to some extent you already don't worry about it, because pb would have to be defined differently if such fences ever made their way into LKMM.)

Now cumul-fence/prop cares about the A-cumulativity aspect, and pb about the strong-fence aspect, but of course the A-cumulativity also appears in pb, just hidden right now through the additional rfe? at the end of prop (if there were non-A-cumulative strong fences, I think it would need to be pb = overwrite & ext ; cumul-fence* ; (A-cumul(...) | ...)). So I don't think one can draw the A-cumulativity aspect delineation between the relations clearly (when allowing for orthogonality). I'm proposing instead to make A-cumulativity a part of being a strong-fence and drawing the strong-fence delineation clearly.


You're fixating on an irrelevancy.

The only thing I'm fixating on is whether it makes sense to remove certain redundancies in LKMM.
And I don't think that's irrelevant; it's about reducing friction and making thinking about LKMM faster in the long run.

This may be so, but I would like to point out that the memory model was
not particularly designed to be as short or as simple as possible, but
more to reflect transparently the intuitions that kernel programmers
have about the ways ordering should work in the kernel. It may very
well include redundancies as a result. I don't think that's a bad
point.
I agree that sometimes redundancies have value. But I think having, where
possible, a kind of minimal responsibility principle where each fence
appears in as few relations as possible also has value.
I've seen that when I try to help people in my team learn to use LKMM: they
see a specific type of fence and get stuck for a while chasing one of its
uses. For example, they may chase a long prop link using the only strong
fence in the example somewhere in the middle, which will then later turn out
to be a dead-end because what they need to use is pb.
People who make that particular mistake should take a lesson from it:
The presence of a strong fence should point them toward looking for an
application of pb before looking at prop, because pb is is based on the
special properties of strong fences whereas prop is not.

Yes, with two caveats:
- you can remove the "before looking at prop" since there's never ever any point of looking at (extending) prop when you have a strong fence
- why give people the opportunity for that mistake in the first place? ...
[...] by making a mistake through pursuing
a redundant pathway, people can deepen their understanding. That's how
one learns.
... I do feel reminded about the discussion about building character : )

Honestly I would easily see your point if there were some uncommon reasons to use the strong fence to extend prop. Then the lesson is what you mentioned: usually, strong fences should be looked at through the pb lense, and only if you get stuck that way it makes sense to look through the prop lense (and probably one could develop a good intuition for which situation calls for which after some time).

But that's not the case here. There's nothing to learn here except that one should pretend that strong-fence didn't exist in prop.
That lesson could also be learned by not having it there in the first place.
And I think you can easily present LKMM in a way that makes this look reasonable.

For someone who doesn't know that we never need to rely on that use to get
ordering, it basically doubles the amount of time spent looking at the graph
and chasing definitions. And for very good reasons there already are alot of
definitions even when redundancies are eliminated.
IMO, people who try to use the memory model this way need to develop a
good understanding of it first.
I disagree; both for the reason you mention later, but also because IMHO the big advantage of a formal model is you don't need to get a good understanding before you can start going and tackling issues.
In German we say "entlanghangeln" which literally means "to make one's way hand over hand along a rope", as a metaphor for following formal definitions carefully step by step when one doesn't yet have a strong intuition to get everything right without needing to look at the formalism; the formalism is kind of a safety line that prevents one from falling into the abyss.
(and in the spirit of what you said below, while doing that with an attentive mind one builds intuition and understanding).

(Although perhaps carrying out these
sorts of exercises is _how_ people develop their understanding...) I
don't regard it as a bad thing that by making a mistake through pursuing
a redundant pathway, people can deepen their understanding. That's how
one learns.

I agree.

Perhaps I would say that including these redundancies is good for
understanding why the formalization makes sense, but excluding them is
better for actually using the formalization.
This includes both when looking at code while having a printout of the model
next to you, but also when trying to reason about LKMM itself (e.g., what
one might do when changes are made to LKMM and one wants to check that they
interact well with the existing parts of LKMM).
Not necessarily so. You might _want_ a change to affect one of the
redundant paths but not the other.

I definitely agree that it might very well be that the redundancy is dissolved at a later point in time through such discovery.
In this case one would have simplified too much : )

But what I mean is, for example, when introducing rmw-sequences a question that came up is whether they are sufficient to ensure monotonicity.

Analyzing stuff like this is much easier on a simplified model (including some other simplifications that are all equivalent to LKMM as written), because there are a lot fewer cases to cover.
It turns a proof that feels like a bookkeeping nightmare into something that I can manage in a few pages of paper.
Similarly when thinking about whether rcu could be understood through a lense that is closer to the Grace Period Guarantee rather than counting gps and rscs, I do this on the simplified model, because those equivalence proofs become easier.

I'm wondering a little if there's some way in the middle, e.g., by writting short comments in the model wherever something is redundant. Something like (* note: strong-fence is included here for completeness, and can be safely ignored *).

Best wishes,
  jonas