Re: Uses for memory barriers

From: Paul E. McKenney
Date: Mon Sep 11 2006 - 12:17:17 EST


On Fri, Sep 08, 2006 at 10:25:45PM -0400, Alan Stern wrote:
> On Sat, 9 Sep 2006, Oliver Neukum wrote:
>
> > > But what _is_ the formal definition of a memory barrier? I've never seen
> > > one that was complete and correct.
> >
> > I' d say "mb();" is "rmb();wmb();"
> >
> > and they work so that:
> >
> > CPU 0
> >
> > a = TRUE;
> > wmb();
> > b = TRUE;
> >
> > CPU 1
> >
> > if (b) {
> > rmb();
> > assert(a);
> > }
> >
> > is correct. Possibly that is not a complete definition though.
>
> It isn't. Paul has agreed that this assertion:
>
> CPU 0 CPU 1
> ----- -----
> while (x == 0) relax(); x = -1;
> x = a; y = b;
> mb(); mb();
> b = 1; a = 1;
> while (x < 0) relax();
> assert(x==0 || y==0);
>
> will not fail. I think this would not be true if either of the mb()
> statements were replaced with {rmb(); wmb();}.
>
> To put it another way, {rmb(); wmb();} guarantees that any preceding read
> will complete before any following read and any preceding write will
> complete before any following write. However it does not guarantee that
> any preceding read will complete before any following write, whereas mb()
> does guarantee that. (To whatever extent these statements make sense.)

This is a summary of the Linux memory-barrier semantics as I understand
them:

1. A given CPU will always perceive its own memory operations
as occuring in program order.

2. All stores to a given single memory location will be perceived
as having occurred in the same order by all CPUs. This is
"coherence". (And this is the property that I was forgetting
about when I first looked at your second example.)

3. A given type of memory barrier, when executed on a given CPU,
causes that CPU's prior accesses of the corresponding type to
be perceived by other CPUs as having occurred before the given
CPU's subsequent accesses of the corresponding type.

The types of memory barriers are rmb(), which segregates only
reads, wmb(), which segregates only writes, and mb(), which
segregates both. There is also mmiowb(), which is like wmb(),
but gives additional ordering guarantees that extend to I/O
busses, such as PCI bridges.

Alan is quite correct when he says that rmb();wmb(); is not necessarily
equivalent to mb(). For example:

Sequence 1 Sequence 2

load A load A
store B store B
mb() rmb();wmb()
load C load C
store D store D

In sequence 1, other CPUs will see the load from A and the store to B both
preceding the load from C and the store to D. In sequence 2, other CPUs
might well see the store to D preceding the load from A, or, conversely,
the store to B following the load from C. This second scenario might seem
unlikely, but there is real hardware that has similar properties (e.g.,
ppc's separately ordering accesses to cached and to non-cached memory).

In all these cases, of course, these other CPUs would themselves be needing
to use memory barriers to order their own accesses.

You guys asked!!!

Thanx, Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/