Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

From: Jonas Oberhauser
Date: Thu Jan 19 2023 - 06:23:36 EST

Next message: Andy Shevchenko: "Re: [PATCH v1 1/2] iio: core: Replace iio_sysfs_match_string_with_gaps() by __sysfs_match_string()"
Previous message: Jerome Brunet: "Re: [PATCH V6 0/3] Add S4 SoC PLL and Peripheral clock controller"
In reply to: Alan Stern: "Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)"
Next in thread: Alan Stern: "Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 1/19/2023 3:28 AM, Alan Stern wrote:

> This is a permanent error; I've given up. Sorry it didn't
work out.

[It seems the e-mail still reached me through the mailing list]

hopefully with just one main thought per email! :-)

Honestly, I wish I could. I'm already trying hard but sometimes things are interconnected, and my brain isn't good at withholding information -- including incorrect and incomprehensible information :D

On Wed, Jan 18, 2023 at 12:25:05PM +0100, Jonas Oberhauser wrote:

On 1/17/2023 10:19 PM, Alan Stern wrote:

On Tue, Jan 17, 2023 at 06:48:12PM +0100, Jonas Oberhauser wrote:

Pretending for simplicity that rscs and grace periods aren't reads&writes

They aren't. You don't have to pretend.

rscs are reads& writes in herd. That's how the data dependency works in your
patch, right?

No, you're mixing up RCU and SRCU.

Yes, I meant for the srcu : ) Any argument I'm trying to make just for rcu right now, will need to still work for srcu later.

I consider that a hack though and don't like it.

It _is_ a bit of a hack, but not a huge one. srcu_read_lock() really
is a lot like a load, in that it returns a value obtained by reading
something from memory (along with some other operations, though, so it
isn't a simple straightforward read -- perhaps more like an
atomic_inc_return_relaxed).

The issue I have with this is that it might create accidental ordering. How does it behave when you throw fences in the mix?
It really does not work like an increment at all, I think srcu_read_lock() only reads the currently active index, but the index is changed by srcu_sync. But even that is an implementation detail of sorts. I think the best way to think of it would be for srcu_read_lock to just return an arbitrary value.
The user can not rely on any kind of "accidental" rfe edges between these events for ordering.

Perhaps if you flag any use of these values in address or control dependencies, as well as any event which depends on more than one of these values, you could prove that it's impossible to contrain the behavior through these rfe(and/or co) edges because you can anyways never inspect the value returned by the operation (except to pass it into srcu_unlock).

Or you might be able to explicitly eliminate the events everywhere, just like you have done for carry-dep in your patch.

But it looks so brittle.

srcu_read_unlock() is somewhat less like a store, but it does have one
notable similarity: It takes an input value and therefore can be the
target of a data dependency. The biggest difference is that an
srcu_read_unlock() can't really be read-from. It would be nice if herd
had an event type that behaved this way.

Or if you could declare your own : )
Obviously, you will have accidental rf edges going from srcu_read_unlock() to srcu_read_lock() if you model them this way.

Also, herd doesn't have any way of saying that the value passed to a
store is an unmodified copy of the value obtained by a load. In our
case that doesn't matter much -- nobody should be writing litmus tests
in which the value returned by srcu_read_lock() is incremented and then
decremented again before being passed to srcu_write_lock()!

It would be nice if herd allowed declaring structs that can be used for such purposes.
(anyways, I am not sure if Luc is still following everything in this deeply nested thread that started somewhere completely different. But maybe if Paul or you open a feature request, let me know so that I can give my 2ct)

By golly, you're right! I'm still thinking in terms of an older
version of the memory model, which used gp in place of rcu-gp.

I'm glad I'm not the only person to mix these two up xP

Note that if your ordering relies on actually using gp twice in a row, then
these must come from strong-fence, but you should be able to just take the
shortcut by merging them into a single gp.
po;rcu-gp;po;rcu-gp;po <= gp <= strong-fence <= hb & strong-order

I don't know what you mean by this. The example above does rely on
having two synchronize_rcu() calls; with only one it would be allowed.

I mean that if you have a cycle that is formed by having two adjacent actual `gp` edges, like .... ; gp;gp ; .... with gp= po ; rcu-gp ; po?,
(not like your example, where the cycle uses two *rcu*-gp but no gp edges) and assume we define gp' = po ; rcu-gp ; po and hb' and pb' to use gp' instead of gp,
then there are two cases for how that cycle came to be, either 1) as
... ; hb;hb ; ....
but then you can refactor as
... ; po;rcu-gp;po;rcu-gp;po ; ...
... ; po;rcu-gp; po    ; ...
... ;         gp'            ; ...
... ;         hb'            ; ...
which again creates a cycle, or 2) as
... ; pb ; hb ; ...
coming from
... ; prop ; gp ; gp ; ....
which you can similarly refactor as
... ; prop ; po;rcu-gp;po ; ....
... ; prop ;      gp' ; ....
and again get a cycle with
... ; pb' ; ....
Therefore, gp = po;rcu-gp;po should be equivalent.

I don't think rcu-order is necessary at all to define LKMM, and one can
probably just use rcu-extend instead of rcu-order (and in fact even a
version of rcu-extend without any lone rcu-gps).

Sure, you could do that, but it wouldn't make sense. Why would anyone
want to define an RCU ordering relation that includes

gp ... rscs ... gp ... rscs

but not

gp ... rscs ... rscs ... gp

?

Because the the RCU Grace Period Guarantee doesn't say "if a gp happens
before a gp, with some rscs in between, ...".
So I think even the picture is not the best picture to draw for RCU
ordering. I think the right picture to draw for RCU ordering is something
like this:
case (1): C ends before G does:

rcsc ... ... ... ... ... gp

case (2): G ends before C does:

gp ... ... ... ... ... rscs

where the dots are some relation that means "happens before".

Okay. So we could define rcu-order by:

let rec rcu-order = (rcu-gp ; rcu-link ; (rcu-order ; rcu-link)* ; rcu-rscsi) |
(rcu-rscsi ; rcu-link ; (rcu-order ; rcu-link)* ; rcu-gp)

(ignoring the SRCU cases). That is a little awkward; it might make
sense to factor out (rcu-link ; (rcu-order ; rcu-link)*) as a separate
relation and do a simultaneous recursion on both relations.

Exactly!

But either way, rcu-fence would have to be defined as (po ; rcu-order+ ; po?),
which looks a little odd.

Almost, (assuming the rb definition is changed to be like the pb one*) it would be

rcu-fence = (po ; (rcu-order ; po)+).

Alernatively, I believe (but haven't fully confirmed) it would also work to define
rcu-fence = po ; rcu-order ; po?

This is why I am wondering whether
order = po ; {inducing operation} ; po?
is ok in general or not.

Have fun,
jonas

(*= otherwise you'd need to include the rcu-link in here as well)

Next message: Andy Shevchenko: "Re: [PATCH v1 1/2] iio: core: Replace iio_sysfs_match_string_with_gaps() by __sysfs_match_string()"
Previous message: Jerome Brunet: "Re: [PATCH V6 0/3] Add S4 SoC PLL and Peripheral clock controller"
In reply to: Alan Stern: "Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)"
Next in thread: Alan Stern: "Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]