Re: Documentation for plain accesses and data races

From: Boqun Feng
Date: Mon Sep 16 2019 - 07:53:08 EST


On Mon, Sep 16, 2019 at 01:17:53PM +0800, Boqun Feng wrote:
> Hi Alan,
>
> I spent some time reading this, and it is really helpful! Thanks.
>
> Please see comments below:
>
> On Fri, Sep 06, 2019 at 02:11:29PM -0400, Alan Stern wrote:
> [...]
> > If two memory accesses aren't concurrent then one must execute before
> > the other. Therefore the LKMM decides two accesses aren't concurrent
> > if they can be connected by a sequence of hb, pb, and rb links
> > (together referred to as xb, for "executes before"). However, there
> > are two complicating factors.
> >
> > If X is a load and X executes before a store Y, then indeed there is
> > no danger of X and Y being concurrent. After all, Y can't have any
> > effect on the value obtained by X until the memory subsystem has
> > propagated Y from its own CPU to X's CPU, which won't happen until
> > some time after Y executes and thus after X executes. But if X is a
> > store, then even if X executes before Y it is still possible that X
> > will propagate to Y's CPU just as Y is executing. In such a case X
> > could very well interfere somehow with Y, and we would have to
> > consider X and Y to be concurrent.
> >
> > Therefore when X is a store, for X and Y to be non-concurrent the LKMM
> > requires not only that X must execute before Y but also that X must
> > propagate to Y's CPU before Y executes. (Or vice versa, of course, if
> > Y executes before X -- then Y must propagate to X's CPU before X
> > executes if Y is a store.) This is expressed by the visibility
> > relation (vis), where X ->vis Y is defined to hold if there is an
> > intermediate event Z such that:
> >
> > X is connected to Z by a possibly empty sequence of
> > cumul-fence links followed by an optional rfe link (if none of
> > these links are present, X and Z are the same event),
> >
>
> I wonder whether we could add an optional ->coe or ->fre link between
> X and the possibly empty sequence of cumul-fence links, similar to the
> definition of ->prop. This makes sense because if we have
>
> X ->coe X' (and X' is on CPU C)
>
> , X must already be propagated to C before X' executes, according to
> our operational model:
>
> "... In particular, it must arrange for the store to be co-later than
> (i.e., to overwrite) any other store to the same location which has
> already propagated to CPU C."
>
> In other words, we can define ->vis as:
>
> let vis = prop ; ((strong-fence ; [Marked] ; xbstar) | (xbstar & int))
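>
> (For comparison, IIUC the current definition in linux-kernel.cat is
> roughly:
>
> let vis = cumul-fence* ; rfe? ; [Marked] ;
> 	((strong-fence ; [Marked] ; xbstar) | (xbstar & int))
>
> , so the proposal above only replaces the "cumul-fence* ; rfe? ;
> [Marked]" head with a full ->prop.)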
>

Hmm.. so the problem with this approach is that the (xbstar & int) part
doesn't satisfy the requirement of visibility... i.e.

X ->prop Z ->(xbstar & int) Y

may not guarantee that X has already propagated to Y's CPU by the time
Y executes.
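
For example, here is a sketch of the shape I have in mind (x and y are
made-up shared variables; assume the execution in which the store of
x=1 is co-before the store of x=2):

int x = 0, y = 0;

P0()
{
	X: WRITE_ONCE(x, 1);
}

P1()
{
	Z: WRITE_ONCE(x, 2);
	smp_wmb();
	Y: WRITE_ONCE(y, 1);
}

Here X ->coe Z gives X ->prop Z, and the smp_wmb() gives Z ->(xbstar &
int) Y, so X ->vis Y would hold under the proposed definition; but I
don't see anything that forces the store to x to reach P1 before Y
executes.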

Lemme think more on this...

Regards,
Boqun

> , and for this document, we could reference the "prop" section in
> explanation.txt. This could make the model simpler (both in the
> description and in the explanation), and another benefit is that we
> won't need this commit in Paul's dev branch:
>
> c683f2c807d2 "tools/memory-model: Fix data race detection for unordered store and load"
>
> , and the data race rules will look more symmetrical ;-)
>
> Thoughts? Or am I missing something subtle here?
>
> Regards,
> Boqun
>
> > and either:
> >
> > Z is connected to Y by a strong-fence link followed by a
> > possibly empty sequence of xb links,
> >
> > or:
> >
> > Z is on the same CPU as Y and is connected to Y by a possibly
> > empty sequence of xb links (again, if the sequence is empty it
> > means Z and Y are the same event).
> >
> > The motivations behind this definition are straightforward:
> >
> > cumul-fence memory barriers force stores that are po-before
> > the barrier to propagate to other CPUs before stores that are
> > po-after the barrier.
> >
> > An rfe link from an event W to an event R says that R reads
> > from W, which certainly means that W must have propagated to
> > R's CPU before R executed.
> >
> > strong-fence memory barriers force stores that are po-before
> > the barrier, or that propagate to the barrier's CPU before the
> > barrier executes, to propagate to all CPUs before any events
> > po-after the barrier can execute.
> >
> > To see how this works out in practice, consider our old friend, the MP
> > pattern (with fences and statement labels, but without the conditional
> > test):
> >
> > int buf = 0, flag = 0;
> >
> > P0()
> > {
> > 	X: WRITE_ONCE(buf, 1);
> > 	smp_wmb();
> > 	W: WRITE_ONCE(flag, 1);
> > }
> >
> > P1()
> > {
> > 	int r1;
> > 	int r2 = 0;
> >
> > 	Z: r1 = READ_ONCE(flag);
> > 	smp_rmb();
> > 	Y: r2 = READ_ONCE(buf);
> > }
> >
> > The smp_wmb() memory barrier gives a cumul-fence link from X to W, and
> > assuming r1 = 1 at the end, there is an rfe link from W to Z. This
> > means that the store to buf must propagate from P0 to P1 before Z
> > executes. Next, Z and Y are on the same CPU and the smp_rmb() fence
> > provides an xb link from Z to Y (i.e., it forces Z to execute before
> > Y). Therefore we have X ->vis Y: X must propagate to Y's CPU before Y
> > executes.
> >
> [...]
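
FWIW, the MP example above is basically the in-tree litmus test
MP+fencewmbonceonce+fencermbonceonce.litmus (IIRC); here is a sketch of
it in litmus format, in case anyone wants to cross-check with herd7:

C MP+fencewmbonceonce+fencermbonceonce

{}

P0(int *buf, int *flag)
{
	WRITE_ONCE(*buf, 1);
	smp_wmb();
	WRITE_ONCE(*flag, 1);
}

P1(int *buf, int *flag)
{
	int r1;
	int r2;

	r1 = READ_ONCE(*flag);
	smp_rmb();
	r2 = READ_ONCE(*buf);
}

exists (1:r1=1 /\ 1:r2=0)

herd7 with linux-kernel.cfg should report that the r1=1 && r2=0 outcome
never happens, which is consistent with the X ->vis Y reasoning above.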

