Re: [PATCH] Document Linux's memory barriers [try #5]

From: David Howells
Date: Thu Mar 30 2006 - 15:15:46 EST


Suzanne Wood <suzannew@xxxxxxxxxx> wrote:

> Do you mean to formalize preconditions on the value of Y and contents of A
> and consider postconditions after the execution of the three statements of
> the example where the value of X is the prior content of A and A contains and
> Z equals the value of Y.

Something like that, yes.

How about the attached instead? My explanation didn't give a write/write
example, so I've added that in too.

> Sorry to be unclear. I was just asking about the explanation of the
> self-consistent CPU example. The other ideas in the document are more
> difficult, so thought this part might be simplified.

The whole subject is fraught with unclarity:-)

David


However, it is guaranteed that a CPU will be self-consistent: it will see its
_own_ accesses appear to be correctly ordered, without the need for a memory
barrier. For instance with the following code:

U = *A;
*A = V;
*A = W;
X = *A;
*A = Y;
Z = *A;

and assuming no intervention by an external influence, it can be taken that the
final result will appear to be:

U == the original value of *A
X == W
Z == Y
*A == Y

The code above may cause the CPU to generate the full sequence of memory
accesses:

U=LOAD *A, STORE *A=V, STORE *A=W, X=LOAD *A, STORE *A=Y, Z=LOAD *A

But, without intervention, the sequence may have almost any combination of
elements barring the load of *A into U and the store of Y into *A discarded as
the CPU may combine or discard memory accesses as it sees fit, provided _its_
view of the world is consistent. This leads to the following possible other
sequences:

U=LOAD *A, STORE *A=V, STORE *A=W, X=LOAD *A, STORE *A=Y
U=LOAD *A, STORE *A=V, STORE *A=W, STORE *A=Y, Z=LOAD *A
U=LOAD *A, STORE *A=V, STORE *A=W, STORE *A=Y
U=LOAD *A, STORE *A=V, STORE *A=Y, Z=LOAD *A
U=LOAD *A, STORE *A=V, STORE *A=Y
U=LOAD *A, STORE *A=W, X=LOAD *A, STORE *A=Y
U=LOAD *A, STORE *A=W, STORE *A=Y, Z=LOAD *A
U=LOAD *A, STORE *A=W, STORE *A=Y
U=LOAD *A, STORE *A=Y, Z=LOAD *A
U=LOAD *A, STORE *A=Y

Note:

(*) STORE *A=W can only be dispensed with if X=LOAD *A is also dispensed with,
otherwise X would be given the original value of *A or the value assigned
there from V, and not the value assigned there from W.

(*) STORE *A=V would probably be discarded entirely by the CPU - if not the
compiler - as it's effect is overridden by STORE *A=W without anything
seeing it.

(*) STORE *A=Y may be deferred indefinitely until the CPU has some reason to
perform it or discard it.

(*) Even the two 'fixed' operations can be dispensed with if they can be
combined or discarded with respect to the surrounding code.

(*) The compiler may also combine, discard or defer elements of the sequence
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/