Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

From: Jonas Oberhauser
Date: Thu Jan 26 2023 - 07:19:07 EST




On 1/26/2023 2:53 AM, Paul E. McKenney wrote:
> On Wed, Jan 25, 2023 at 08:45:44PM -0500, Alan Stern wrote:
>> On Wed, Jan 25, 2023 at 03:33:08PM -0800, Paul E. McKenney wrote:
>>> Ah, and returning to the earlier question as to whether srcu_read_unlock()
>>> can use release semantics instead of smp_mb(), at the very least, this
>>> portion of the synchronize_srcu() function's header comment must change:
>>>
>>>     On systems with more than one CPU, when synchronize_srcu()
>>>     returns, each CPU is guaranteed to have executed a full
>>>     memory barrier since the end of its last corresponding SRCU
>>>     read-side critical section whose beginning preceded the call
>>>     to synchronize_srcu().
>>
>> Of course, there might be code relying on a guarantee that
>> srcu_read_unlock() executes a full memory barrier. This guarantee would
>> certainly no longer hold. But as I understand it, this guarantee was
>> never promised by the SRCU subsystem.
> That indented sentence was copied from the synchronize_srcu() function's
> header comment, which might be interpreted by some as a promise by the
> SRCU subsystem.

I think we all understand that it is a promise of the SRCU subsystem; the question is just what exactly is being promised.
As Alan said, if the promise is interpreted as something like

"every store that propagated to the read-side critical section must have propagated to all CPUs before the synchronize_srcu() ends" (where the RSCS and synchronize_srcu() calls are those from the promise)

then that guarantee holds even if you only use a release fence to communicate the end of the RSCS to the grace period (GP). Note that this interpretation is analogous to the promise of smp_mb__after_unlock_lock(), which says that an UNLOCK+LOCK pair acts as a full fence: here the read-side unlock and the GP together act as a full memory barrier.
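
To make this first reading concrete, here is a minimal user-space sketch using C11 atomics and pthreads. It is only an analogy: it is not kernel code, the C11 memory model is not the LKMM, and every name in it (x, unlock_count, writer(), reader(), gp_wait_then_read(), run_once()) is hypothetical. The reader ends its modelled RSCS with a release-only increment, and the modelled grace period observes that increment with acquire semantics and then executes one full fence. The sketch only checks the updater's own view of x after the modelled GP, not the view of every other CPU.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
static atomic_int x;             /* data stored by some third CPU */
static atomic_int unlock_count;  /* models the counter bumped at RSCS end */

static void *writer(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	return NULL;
}

static void *reader(void *arg)
{
	(void)arg;
	/* Inside the RSCS: spin until the writer's store has reached us. */
	while (atomic_load_explicit(&x, memory_order_relaxed) == 0)
		;
	/* Modelled unlock: release semantics only, no full barrier. */
	atomic_fetch_add_explicit(&unlock_count, 1, memory_order_release);
	return NULL;
}

/* Modelled grace period: observe the unlock, then one full fence. */
static int gp_wait_then_read(void)
{
	while (atomic_load_explicit(&unlock_count, memory_order_acquire) == 0)
		;
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&x, memory_order_relaxed);
}

/*
 * Runs one round. Returns the value of x seen after the modelled GP,
 * or -1 if a thread could not be created.
 */
int run_once(void)
{
	pthread_t w, r;
	int seen;

	atomic_store(&x, 0);
	atomic_store(&unlock_count, 0);
	if (pthread_create(&r, NULL, reader, NULL) != 0)
		return -1;
	if (pthread_create(&w, NULL, writer, NULL) != 0) {
		atomic_store(&x, 1);	/* let the reader terminate */
		pthread_join(r, NULL);
		return -1;
	}
	seen = gp_wait_then_read();
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return seen;
}
```

Under the C11 rules, run_once() should return 1 whenever both threads start: the reader's load of x happens before the updater's load (release increment observed by the acquire load), so by read-read coherence the updater cannot see an older value of x than the reader did. No full barrier in the reader's instruction stream is needed for that.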

On the other hand, if the promise is more literally interpreted as

"there is a (possibly virtual) instruction in the reader-side execution stream that acts as a full memory barrier, and that instruction is executed before the synchronize_srcu() ends"

then that guarantee is violated, and I suppose you might be able to write some absurd client that inspects every store of the reader thread and sees that there is no line in the reader-side code that acts like a full fence. But it would take a lot of effort to discern this.

Someone interpreting the promise like this might, however, reason as follows: the only part of the reader's code that is actually under the control of SRCU, and hence the only place where that full barrier could be hidden, is inside srcu_read_unlock(). They might therefore expect to always find the full barrier there and treat srcu_read_unlock() in general as a full barrier. Considering that the wording explicitly isn't "an srcu_read_unlock() is a full barrier", I hope few people would have this unhealthy idea. But you never know.
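
For illustration, a client relying on that unhealthy idea could look like the store-buffering pattern sketched below, again as a user-space C11 analogy with hypothetical names (a, b, sb_unlock_count, reader_client(), other_cpu(), sb_once()) rather than real SRCU code. If the modelled unlock were a full barrier, both threads reading 0 would be forbidden. With a release-only unlock, C11 permits that outcome, because release does not order an earlier store against a later load.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical client data and a stand-in for the unlock counter. */
static atomic_int a, b;
static atomic_int sb_unlock_count;
static int r0, r1;	/* written by one thread each, read after join */

static void *reader_client(void *arg)
{
	(void)arg;
	atomic_store_explicit(&a, 1, memory_order_relaxed);
	/* Modelled srcu_read_unlock(): release only, NOT a full barrier. */
	atomic_fetch_add_explicit(&sb_unlock_count, 1, memory_order_release);
	r0 = atomic_load_explicit(&b, memory_order_relaxed);
	return NULL;
}

static void *other_cpu(void *arg)
{
	(void)arg;
	atomic_store_explicit(&b, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* a real full fence */
	r1 = atomic_load_explicit(&a, memory_order_relaxed);
	return NULL;
}

/*
 * Runs the pattern once. Returns r0 | (r1 << 1), so 0 means that both
 * threads read 0, or -1 if a thread could not be created.
 */
int sb_once(void)
{
	pthread_t t0, t1;

	atomic_store(&a, 0);
	atomic_store(&b, 0);
	atomic_store(&sb_unlock_count, 0);
	if (pthread_create(&t0, NULL, reader_client, NULL) != 0)
		return -1;
	if (pthread_create(&t1, NULL, other_cpu, NULL) != 0) {
		pthread_join(t0, NULL);
		return -1;
	}
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	return r0 | (r1 << 1);
}
```

A return value of 0 is the outcome such a client would wrongly assume to be impossible. Whether it ever shows up in practice depends on the hardware and compiler: on x86, for example, the atomic increment compiles to a locked instruction that already acts as a full barrier, so the sketch will not exhibit it there, and on weaker architectures it may still be rare.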

Best wishes,
jonas