Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

From: Alan Stern
Date: Fri Jul 13 2018 - 21:51:34 EST


On Fri, 13 Jul 2018, Andrea Parri wrote:

> On Fri, Jul 13, 2018 at 10:16:48AM -0700, Linus Torvalds wrote:
> > On Fri, Jul 13, 2018 at 2:34 AM Will Deacon <will.deacon@xxxxxxx> wrote:
> > >
> > > And, since we're stating preferences, I'll reiterate my preference towards:
> > >
> > > * RCsc unlock/lock
> > > * RCpc release/acquire
> >
> > Yes, I think this would be best. We *used* to have pretty heavy-weight
> > locking rules for various reasons, and we relaxed them for reasons
> > that weren't perhaps always the right ones.
> >
> > Locking is pretty heavy-weight in general, and meant to be the "I
> > don't really have to think about this very much" option. Locking that
> > is not serializing enough, and so confuses people by allowing odd
> > behavior (on _some_ architectures), does not sound like a great idea.
> >
> > In contrast, when you do release/acquire or any of the other "I know
> > what I'm doing" things, I think we want the minimal serialization
> > implied by the very specialized op.
>
> The changes under discussion are _not_ affecting uses such as:
>
> P0:
> spin_lock(s);
> UPDATE data_struct
> spin_unlock(s);
>
> P1:
> spin_lock(s);
> UPDATE data_struct
> spin_unlock(s);
>
> [...]
>
> (most common use case for locking?): these uses work just _fine_ with
> the current implementations and in LKMM.
>
> OTOH, these changes are going to affect uses where threads interact by
> "mixing" locking and _other_ synchronization primitives such as in:
>
> { x = 0; y = 0; }
>
> P0:
> spin_lock(s);
> WRITE_ONCE(x, 1);
> spin_unlock(s);
>
> P1:
> spin_lock(s);
> r0 = READ_ONCE(x);
> WRITE_ONCE(y, 1);
> spin_unlock(s);
>
> P2:
> r1 = smp_load_acquire(&y);
> r2 = READ_ONCE(x);
>
> BUG_ON(r0 == 1 && r1 == 1 && r2 == 0)
>
> and
>
> { x = 0; y = 0; z = 0; }
>
> P0:
> spin_lock(s);
> WRITE_ONCE(x, 1);
> r0 = READ_ONCE(y);
> spin_unlock(s);
>
> P1:
> spin_lock(s);
> WRITE_ONCE(y, 1);
> r1 = READ_ONCE(z);
> spin_unlock(s);
>
> P2:
> WRITE_ONCE(z, 1);
> smp_mb();
> r2 = READ_ONCE(x);
>
> BUG_ON(r0 == 0 && r1 == 0 && r2 == 0)
>
> (inspired by __two__ uses in kernel/{sched,rcu}). Even if someone were
> to tell me that locks serialize enough, I'd still be prompted to say "yes,
> but do / can my BUG_ON()s fire?".

The point being that the scenarios under discussion in this thread all
fall most definitely into the "Non-standard usage; you'd better know
exactly what you're doing" category.

Which suggests, by Linus's reasoning, that locking should be as
lightweight as possible while still being able to perform its basic job
of defining critical sections. In other words, RCpc.

And which would still leave smp_mb__after_unlock_lock() available for
more esoteric usages. Although it provides RCsc ordering, I assume the
overhead wouldn't be prohibitive in situations where only RCtso
ordering is needed.
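
As a rough sketch of what such a usage could look like, the first of
the quoted snippets might be written as a litmus test for herd7, with
smp_mb__after_unlock_lock() placed right after P1's spin_lock(). The
test name is made up for illustration, and this assumes the model's
linux-kernel.def in use defines smp_mb__after_unlock_lock(). No
expected result is given here, since that is for the tool to report.

	C unlock-lock-then-acquire

	(*
	 * Hypothetical test name; sketch of the first quoted snippet.
	 * Assumption: the model in use supports
	 * smp_mb__after_unlock_lock().  The "exists" clause below
	 * corresponds to the BUG_ON() condition in the snippet.
	 *)

	{}

	P0(int *x, spinlock_t *s)
	{
		spin_lock(s);
		WRITE_ONCE(*x, 1);
		spin_unlock(s);
	}

	P1(int *x, int *y, spinlock_t *s)
	{
		int r0;

		spin_lock(s);
		smp_mb__after_unlock_lock();
		r0 = READ_ONCE(*x);
		WRITE_ONCE(*y, 1);
		spin_unlock(s);
	}

	P2(int *x, int *y)
	{
		int r1;
		int r2;

		r1 = smp_load_acquire(y);
		r2 = READ_ONCE(*x);
	}

	exists (1:r0=1 /\ 2:r1=1 /\ 2:r2=0)

If saved as, say, test.litmus under tools/memory-model, it should be
checkable with "herd7 -conf linux-kernel.cfg test.litmus". Dropping
the smp_mb__after_unlock_lock() line gives the plain-lock variant for
comparison.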

Alan

> Actually, my very first reaction, before starting what does appear to be
> indeed a long and complex conversation, would probably be to run/check the
> above snippets against the (latest) LKMM, by using the associated tool.
>
> Once "checked" with both people and automated models, I'd probably remain
> suspicious about my "magic" code so that I most likely will be prompted to
> dig into each single arch. implementation / reference manual...
>
> ... Time's up!
>
> Andrea