Re: [PATCH] tools/memory-model: Make ppo a subrelation of po

From: Jonas Oberhauser
Date: Mon Jan 23 2023 - 09:00:43 EST

Next message: Stephan Gerhold: "Re: [PATCH v4 0/6] Add MSM8939 SoC support with two devices"
Previous message: Li Chen: "Re: [PATCH 03/15] dt-bindings: arm: ambarella: Add binding for Ambarella ARM platforms"
In reply to: Alan Stern: "Re: [PATCH] tools/memory-model: Make ppo a subrelation of po"
Next in thread: Alan Stern: "Re: [PATCH] tools/memory-model: Make ppo a subrelation of po"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 1/21/2023 9:56 PM, Alan Stern wrote:

On Sat, Jan 21, 2023 at 01:41:16AM +0100, Jonas Oberhauser wrote:

On 1/20/2023 5:32 PM, Alan Stern wrote:

On Fri, Jan 20, 2023 at 12:12:50PM +0100, Jonas Oberhauser wrote:

On 1/19/2023 9:06 PM, Alan Stern wrote:

That's backward logic. Being strong isn't the reason the fences are
A-cumulative. Indeed, the model could have been written not to assume
that strong fences are A-cumulative.

It's not just the model, it's also how strong fences are introduced in the
documentation.

The documentation can be updated.

Sure. But what would be the benefit?

Aren't you interested in making the memory model more understandable to
students?

Of course : )

The second one is that a strong-fence is an A-cumulative fence which
additionally has that guarantee.

What I meant is that reading the documentation or the model, one might come
to the conclusion that it means the second thing, and in that interpretation
fences that are strong must be A-cumulative.

Okay, so let's change the documentation/model to ensure this doesn't
happen.

I don't see anything wrong with that, especially since I don't think
strong-fence is a standard term that exists in a vacuum and means the first
thing by convention.

Obviously there's some elegance in making things orthogonal rather than
hierarchical.
So I can understand why you defend the first interpretation.
But there's really only a benefit if that opens up some interesting design
space. I just don't see that right now.

So if hypothetically you were ok to change the meaning of strong fence to
include A-cumulativity -- and I think from the model and documentation
perspective it doesn't take much to do that if anything -- then saying "all
strong-fence properties are covered exactly in pb" isn't a big step.

I believe that the difference between strong and weak fences is much
more fundamental and important than the difference between A-cumulative
and non-A-cumulative fences.

Consider an analogy: Some animals are capable of walking around, but no
plants are. Would you ever want to say that a plant is a non-walking
living thing with various properties that differentiate it from animals?
Or does it make more sense to say that plants are living things with
various fundamental properties, and in addition some animals can walk?

I agree that the difference between (the strict definition of) strong and weak is more important than between A-cumulative and not.
The analogy doesn't work well for me though.
Being A-cumulative is also a form of "strength". Definitely if you replace a non-A-cumulative fence with an otherwise equivalent but A-cumulative one (assume such fences exist), then you can never get more behaviors, but you might get fewer.
I think that's why it's more natural for me to think of strong fences in terms of having multiple strong properties, the chief among which is that it requires "earlier" stores to propagate before po-later instructions execute but another one being A-cumulativity, than it would be of thinking of plants as being things that don't walk and have some other properties.

It's a bit like asking in C11 for a barrier that has the sequential
consistency guarantee of appearing in the global order S, but doesn't have
release or acquire semantics. Yes you could define that, but would it really
make sense?

You're still missing the point. The important aspect of the fences in
cumul-fence is that they are A-cumulative, not whether they are strong.

I completely understand that. I just don't think there's anything
fundamentally wrong with alternatively creating a more disjoint hierarchy of
- fences that aren't A-cumulative but order propagation (in cumul-fence,
without A-cumul)
- fences that are A-cumulative but aren't strong (in cumul-fence, with
A-cumul)
- fences that are strong (in pb)

There is yet another level of fences in the hierarchy: those which order
instruction execution but not propagation (smp_rmb() and acquire). One
of the important points about cumul-fence is that it excludes this
level.

That's for a functional reason -- prop simply doesn't work for those
fences, so it has to exclude them. But it does work for strong fences,
so excluding them would be an artificial restriction.

Hm, so could we say some fences order
1) propagation with propagation (weak fences)
2) execution with execution (rmb, acquire)
3) propagation with execution (strong fences)

where ordering with execution implicitly orders with propagation as well because things can only propagate after they execute.
However, the 4th possibility (execution with only propagation) happens not to exist. I'm not sure if it would even be distinguishable from the second type. In the operational model, can you forward from stores that have not executed yet?

Right now, both pb and cumul-fence deal with strong fences. And again, I

I would say that cumul-fence _allows_ strong fences, or _can work with_
strong fences. And I would never want to say that cumul-fence and prop
can't be used with strong fences. In fact, if you find a situation
where this happens, it might incline you to consider if the fence could
be replaced with a weaker one.

Can you explain the latter part?
What does it mean to 'find a situation where this happens'?
As I understand the sentence, in current LKMM I don't think this is possible.
Do you mean that if you find a case where you could make a cycle with cumul-fence/prop using strong fences, you might just rely on a weaker fence instead?

understand that one point of view here is that this is not because strong
fences need to inherently be A-cumulative and included in cumul-fence by
using the strong-fence identifier.
Indeed one could have, in theory, strong fences that aren't A-cumulative,
and using strong-fence is as you wrote just a convenient way to list all the
A-cumulative strong fences because that currently happens to be all of the
strong fences.

Another POV is that one might instead formally describe things in terms of
these three more disjoint classes, adapt the documentation of cumul-fence to
say that it does not deal with strong fences because those are dealt with in
a later relation, and not worry about hypothetical barriers that currently
don't have a justifying use case.
(And I suppose to some extent you already don't worry about it, because pb
would have to be defined differently if such fences ever made their way into
LKMM.)

Now cumul-fence/prop cares about the A-cumulativity aspect, and pb about the
strong-fence aspect, but of course the A-cumulativity also appears in pb,
just hidden right now through the additional rfe? at the end of prop (if
there were non-A-cumulative strong fences, I think it would need to be pb =
overwrite & ext ; cumul-fence* ; (A-cumul(...) | ...)). So I don't think one

Not quite right. A hypothetical non-A-cumulative case for pb would have
to omit the cumul-fence term entirely.

Wouldn't that violate the transitivity of "X is required to propagate before Y" ?
If I have
X ->cumul-fence+ Y ->weird-strong-fence Z
doesn't that mean that for every CPU C,
1. X is required to propagate to C before Y propagates to C
2. Y is required to propagate to C before any instruction po-after Z executes

But then also X is required to pragate to C before any instruction po-after Z executes.
How is this enforced if there is no cumul-fence* in the new pb?

Thinking about prop and pb along these lines gives me a weird feeling.
Trying to pinpoint it down, it seems a little bit weird that A-cumul doesn't appear around the strong-fence in pb. Of course it should not appear after prop which already has an rfe? at the end. Nevertheless, having the rfe? at the end is clearly important to representing the idea behind prop. If it weren't for the fact that A-cumul is needed to define prop, it almost makes me think that it would be nice to express the difference between A-cumulative and non-A-cumulative fences (that order propagation) by saying that an A-cumulative fence has
prop ; a-cumul-fence;rfe? <= prop
while the non-A-cumulative fence has
prop-without-rfe ; non-a-cumul-fence <= prop-without-rfe

where prop links E and F if there is some coherence-later store after E that propagates to F's CPU by the time F executes, and prop-without-rfe links them if that store propagates to any CPU before F propagates to that CPU. (of course this ignores the interplay between these two relations that happens if you have a mix of a-cumul-fences and non-a-cumul-fences).

can draw the A-cumulativity aspect delineation between the relations clearly
(when allowing for orthogonality). I'm proposing instead to make
A-cumulativity a part of being a strong-fence and drawing the strong-fence
delineation clearly.

Maybe so, in some sense. But in practice you're asking to have strong
fences removed from cumul-fence and prop. I don't want to do that, even
at the cost of some redundancy.

Consider the RB pattern as another example. Suppose the read -> write
ordering on one or both sides is provided by a fence rather than a
dependency or some such. Would you want to keep such cycles out of the
(ppo | rfe)+ part of hb+, on the basis that they also can be covered by
((prop \ id) & int)?

This is the kind of case I mentioned before, where there are still good uses for every individual part (i.e., if you have a fence, it might be important for ppo or prop to complete a circle), and the existance of the fences might just draw one into a more likely direction.
Besides, the model one obtains by trying to remove this redundancy just becomes more complicated, which is the opposite of what I want.

Another good example are the many different fences in cumul-fence, such as in a thread that looks like Rx;mb();Wy;wmb(); Wz... : here one can use either one or two cumul-fence edges (link the Wx that Rx reads from with Wy and then Wy with Wz, or link it directly to Wz), but trying to eliminate this redundancy just makes the model more complicated.

I'm not against this partially overlapping kind of redundancy, but I dislike subsuming kind of redundancy where some branches of the logic just never need to be used.

In fact, I wouldn't mind removing the happens-before, propagation, and
rcu axioms from LKMM entirely, replacing them with the single
executes-before axiom.

I was planning to propose the same thing, however, I would also propose to redefine hb and rb by dropping the hb/pb parts at the end of these relations.

hb = ....
pb = prop ; strong-fence ; [Marked]
rb = prop ; rcu-fence ; [Marked]

xb = hb|pb|rb
acyclic xb

I'm wondering a little if there's some way in the middle, e.g., by writting
short comments in the model wherever something is redundant. Something like
(* note: strong-fence is included here for completeness, and can be safely
ignored *).

I have no objection to doing that. It seems like a good idea.

Alan

Perhaps we can start a new thread then to discuss a few points where redundancies might be annotated this way or eliminated.

Best wishes,
jonas

Next message: Stephan Gerhold: "Re: [PATCH v4 0/6] Add MSM8939 SoC support with two devices"
Previous message: Li Chen: "Re: [PATCH 03/15] dt-bindings: arm: ambarella: Add binding for Ambarella ARM platforms"
In reply to: Alan Stern: "Re: [PATCH] tools/memory-model: Make ppo a subrelation of po"
Next in thread: Alan Stern: "Re: [PATCH] tools/memory-model: Make ppo a subrelation of po"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]