Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

From: Jonas Oberhauser
Date: Wed Jan 25 2023 - 12:43:47 EST




On 1/25/2023 6:18 PM, Paul E. McKenney wrote:
On Wed, Jan 25, 2023 at 10:34:40AM -0500, Alan Stern wrote:
On Wed, Jan 25, 2023 at 07:05:20AM -0800, Paul E. McKenney wrote:
On Wed, Jan 25, 2023 at 02:10:08PM +0100, Jonas Oberhauser wrote:

On 1/25/2023 3:20 AM, Paul E. McKenney wrote:
On Tue, Jan 24, 2023 at 08:54:56PM -0500, Alan Stern wrote:
On Tue, Jan 24, 2023 at 02:54:49PM -0800, Paul E. McKenney wrote:
Within the Linux kernel, the rule for a given RCU "domain" is that if
an event follows a grace period in pretty much any sense of the word,
then that event sees the effects of all events in all read-side critical
sections that began prior to the start of that grace period.

Here the senses of the word "follow" include combinations of rf, fr,
and co, combined with the various acyclic and irreflexive relations
defined in LKMM.
The LKMM says pretty much the same thing. In fact, it says the event
sees the effects of all events po-before the unlock of (not just inside)
any read-side critical section that began prior to the start of the
grace period.

And are these anything the memory model needs to worry about?
Given that several people, yourself included, are starting to use LKMM
to analyze the Linux-kernel RCU implementations, maybe it does.

Me, I am happy either way.
Judging from your description, I don't think we have anything to worry
about.
Sounds good, and let's proceed on that assumption then. We can always
revisit later if need be.

Thanx, Paul
FWIW, I currently don't see a need for either RCU or "base" LKMM to have
this kind of guarantee.
In the RCU case, it is because it is far easier to provide this guarantee,
even though it is based on hardware and compilers rather than LKMM,
than it would be to explain to some random person why the access that
is intuitively clearly after the grace period can somehow come before it.

But I'm curious why it doesn't exist in LKMM -- is it because of Alpha
or some other issues that make it hard to guarantee (like a compiler merging
two threads and optimizing or something?), or is it simply that it seemed
like a complicated guarantee with no discernible upside, or something else?
Because to the best of my knowledge, no one has ever come up with a
use for 2+2W and friends that isn't better handled by some much more
straightforward pattern of accesses. So we did not guarantee it in LKMM.

Yes, you could argue that my "ease of explanation" paragraph above is
a valid use case, but I am not sure that this is all that compelling of
an argument. ;-)
Are we all talking about the same thing? There were two different
guarantees mentioned above:

The RCU guarantee about writes in a read-side critical section
becoming visible to all CPUs before a later grace period ends;

The guarantee about the 2+2W pattern and friends being
forbidden.

The LKMM includes the first of these but not the second (for the reason
Paul stated).
I am not sure whether or not we are talking about the same thing,
but given this litmus test:

------------------------------------------------------------------------

C C-srcu-observed-4

(*
* Result: Sometimes
*
* The Linux-kernel implementation is suspected to forbid this.
*)

{}

P0(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;

r1 = srcu_read_lock(s);
WRITE_ONCE(*y, 2);
WRITE_ONCE(*x, 1);
srcu_read_unlock(s, r1);
}

P1(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;

WRITE_ONCE(*y, 1);
synchronize_srcu(s);
WRITE_ONCE(*z, 2);
}

P2(int *x, int *y, int *z, struct srcu_struct *s)
{
WRITE_ONCE(*z, 1);
smp_store_release(x, 2);
}

exists (x=1 /\ y=1 /\ z=1)

------------------------------------------------------------------------

We get the following from herd7:

------------------------------------------------------------------------

$ herd7 -conf linux-kernel.cfg C-srcu-observed-4.litmus
Test C-srcu-observed-4 Allowed
States 8
x=1; y=1; z=1;
x=1; y=1; z=2;
x=1; y=2; z=1;
x=1; y=2; z=2;
x=2; y=1; z=1;
x=2; y=1; z=2;
x=2; y=2; z=1;
x=2; y=2; z=2;
Ok
Witnesses
Positive: 1 Negative: 7
Condition exists (x=1 /\ y=1 /\ z=1)
Observation C-srcu-observed-4 Sometimes 1 7
Time C-srcu-observed-4 0.02
Hash=8b6020369b73ac19070864a9db00bbf8

------------------------------------------------------------------------

This does not seem to me to be consistent with your "The RCU guarantee
about writes in a read-side critical section becoming visible to all
CPUs before a later grace period ends".

I believe the issue is a different one: it is about the prop;prop at the end and is not related to the grace period guarantee. The stores in the critical section do become visible, but the store release never propagates anywhere, because the co-later store from the critical section has already propagated everywhere.
I believe this is because A ->prop B ->prop C only says that there are writes WB and WC such that WB propagates to B's CPU before B executes, WC is co-after B, and WC propagates to C's CPU before C executes. (I think B is the release store here).

But it does not say anything about the propagation/execution order of B and WC, and I believe WC can propagate to every CPU (other than B's) before B, and B never propagates anywhere.
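
For contrast, here is a minimal sketch of the "intuitive" reading of the test. It is not LKMM and not herd7: it assumes a toy model in which every store becomes visible to all CPUs at once (plain interleaving), and in which synchronize_srcu() is split into hypothetical gp_start/gp_end events with the rule that a critical section that began before gp_start must end before gp_end. All event names are made up for illustration.

```python
# Toy model only: sequentially consistent interleavings plus an idealized
# grace-period rule. This is NOT LKMM; the event names are assumptions.
P0 = [("lock",), ("w", "y", 2), ("w", "x", 1), ("unlock",)]
P1 = [("w", "y", 1), ("gp_start",), ("gp_end",), ("w", "z", 2)]
P2 = [("w", "z", 1), ("w", "x", 2)]


def interleavings(threads):
    """Yield every merge of the threads that preserves program order."""
    if all(not t for t in threads):
        yield []
        return
    for i, t in enumerate(threads):
        if t:
            rest = threads[:i] + [t[1:]] + threads[i + 1:]
            for tail in interleavings(rest):
                yield [t[0]] + tail


def final_states(enforce_gp):
    """Return the set of final (x, y, z) values over all interleavings."""
    states = set()
    for trace in interleavings([P0, P1, P2]):
        # Positions of the marker events (lock, unlock, gp_start, gp_end).
        pos = {ev[0]: i for i, ev in enumerate(trace) if len(ev) == 1}
        # Idealized rule: a critical section that starts before the grace
        # period starts may not still be running when the grace period ends.
        if (enforce_gp and pos["lock"] < pos["gp_start"]
                and pos["unlock"] > pos["gp_end"]):
            continue
        mem = {}
        for ev in trace:
            if ev[0] == "w":
                mem[ev[1]] = ev[2]  # the last store in the trace wins
        states.add((mem["x"], mem["y"], mem["z"]))
    return states


if __name__ == "__main__":
    target = (1, 1, 1)
    print("without grace-period rule:", target in final_states(False))
    print("with grace-period rule:   ", target in final_states(True))
```

In this toy model the x=1, y=1, z=1 outcome needs the critical section to span the whole grace period, so the rule excludes it. In LKMM the stores are linked only by co and are not required to propagate in that order, which is why the toy model says nothing about what LKMM itself allows.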

Again, I am OK with LKMM allowing C-srcu-observed-4.litmus, as long as
the actual Linux-kernel implementation forbids it.

Is it really that important that the implementation forbids it? Do you have a use case?

Best wishes, jonas