Re: rcu_read_lock lost its compiler barrier

From: Andrea Parri
Date: Thu Jun 06 2019 - 04:21:11 EST


> This example really does point out a weakness in the LKMM's handling of
> data races. Herbert's litmus test is a great starting point:
>
>
> C xu
>
> {}
>
> P0(int *a, int *b)
> {
> WRITE_ONCE(*a, 1);
> synchronize_rcu();
> *b = 2;
> }
>
> P1(int *a, int *b)
> {
> rcu_read_lock();
> if (READ_ONCE(*a) == 0)
> *b = 1;
> rcu_read_unlock();
> }
>
> exists (~b=2)
>
>
> Currently the LKMM says the test is allowed and there is a data race,
> but this answer clearly is wrong since it would violate the RCU
> guarantee.
>
> The problem is that the LKMM currently requires all ordering/visibility
> of plain accesses to be mediated by marked accesses. But in this case,
> the visibility is mediated by RCU. Technically, we need to add a
> relation like
>
> ([M] ; po ; rcu-fence ; po ; [M])
>
> into the definitions of ww-vis, wr-vis, and rw-xbstar. Doing so
> changes the litmus test's result to "not allowed" and no data race.
> However, I'm not certain that this single change is the entire fix;
> more thought is needed.

This seems a sensible change to me: looking forward to seeing a patch,
on top of -rcu/dev, for further review and testing!

We could also add (to LKMM) the barrier() for rcu_read_{lock,unlock}()
discussed in this thread (maybe once the RCU code and the informal doc
will have settled in such direction).

It seems worth stressing the fact that _neither_ of these changes will
prevent the test below from being racy, considered the two accesses to
"a" happen concurrently / without synchronization.

Thanks,
Andrea

C xu-2

{}

P0(int *a, int *b)
{
*a = 1;
synchronize_rcu();
WRITE_ONCE(*b, 2);
}

P1(int *a, int *b)
{
rcu_read_lock();
if (*a == 0)
WRITE_ONCE(*b, 1);
rcu_read_unlock();
}

exists (~b=2)