Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

From: Daniel Lustig
Date: Mon Jul 09 2018 - 13:30:00 EST


On 7/9/2018 9:52 AM, Will Deacon wrote:
> On Fri, Jul 06, 2018 at 02:10:55PM -0700, Paul E. McKenney wrote:
>> On Fri, Jul 06, 2018 at 04:37:21PM -0400, Alan Stern wrote:
>>> On Thu, 5 Jul 2018, Andrea Parri wrote:
>>>
>>>>> At any rate, it looks like instead of strengthening the relation, I
>>>>> should write a patch that removes it entirely. I also will add new,
>>>>> stronger relations for use with locking, essentially making spin_lock
>>>>> and spin_unlock be RCsc.
>>>>
>>>> Thank you.
>>>>
>>>> Ah let me put this forward: please keep an eye on the (generic)
>>>>
>>>> queued_spin_lock()
>>>> queued_spin_unlock()
>>>>
>>>> (just to point out an example). Their implementation (in particular,
>>>> the fast path) suggests that if we stick to RCsc locks then
>>>> we should also stick to RCsc acq. load from RMW and rel. store.

Just to be clear, this is "RCsc with W->R exception" again, right?

>>> A very good point. The implementation of those routines uses
>>> atomic_cmpxchg_acquire() to acquire the lock. Unless this is
>>> implemented with an operation or fence that provides write-write
>>> ordering (in conjunction with a suitable release), qspinlocks won't
>>> have the ordering properties that we want.
>>>
>>> I'm going to assume that the release operations used for unlocking
>>> don't need to have any extra properties; only the lock-acquire
>>> operations need to be special (i.e., stronger than a normal
>>> smp_load_acquire). This suggests that atomic RMW functions with acquire
>>> semantics should also use this stronger form of acquire.

It's not clear to me that the burden of enforcing "RCsc with W->R
exception" should always be placed only on the acquire half.
RISC-V currently places some of the burden on the release half, as
we discussed last week. Specifically, there are a few cases where
fence.tso is used instead of fence rw,w on the release side.

If we always use fence.tso here, following the current recommendation,
we'll still be fine. If LKMM introduces an RCpc vs. RCsc distinction
of some kind, though, I think we would want to distinguish the two
types of release accordingly as well.
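
To make the two halves concrete, here is a rough userspace sketch in
C11 atomics of a lock whose fast path has the same shape as the one
being discussed: a cmpxchg with acquire semantics to take the lock and
a store with release semantics to drop it. This is only an
illustration. The toy_* names are hypothetical, this is not the
qspinlock code, and C11's acquire/release annotations should not be
read as providing the stronger ordering proposed in this thread. The
sketch only shows where the acquire half and the release half sit, and
therefore where an architecture could choose to put the extra strength.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock for illustration; not a kernel API. */
struct toy_lock {
	atomic_int val;		/* 0 = unlocked, 1 = locked */
};

/*
 * Acquire half: userspace analogue of atomic_cmpxchg_acquire().
 * Acquire ordering applies only if the exchange succeeds.
 */
static inline bool toy_trylock(struct toy_lock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->val, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Spin until the acquire-cmpxchg succeeds. */
static inline void toy_lock_acquire(struct toy_lock *l)
{
	while (!toy_trylock(l))
		;	/* busy-wait */
}

/*
 * Release half: userspace analogue of smp_store_release().
 * Whether a stronger release belongs here is the open question
 * raised above.
 */
static inline void toy_unlock(struct toy_lock *l)
{
	atomic_store_explicit(&l->val, 0, memory_order_release);
}
```

If the RCpc vs. RCsc distinction does end up in LKMM, both
toy_trylock() and toy_unlock() in a sketch like this would be
candidates for the stronger variant, not just the former.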

>>> Does anybody have a different suggestion?
>>
>> The approach you suggest makes sense to me. Will, Peter, Daniel, any
>> reasons why this approach would be a problem for you guys?
>
> qspinlock is very much opt-in per arch, so we can simply require that
> an architecture must have RCsc RmW atomics if they want to use qspinlock.
> Should an architecture arise where that isn't the case, then we could
> consider an arch hook in the qspinlock code, but I don't think we have
> to solve that yet.
>
> Will

This sounds reasonable to me.

Dan