Re: [PATCH] Fix data race in mark_rt_mutex_waiters

From: Hernan Ponce de Leon
Date: Thu Jan 26 2023 - 16:07:59 EST


On 1/26/2023 1:20 PM, Peter Zijlstra wrote:
> On Thu, Jan 26, 2023 at 10:42:07AM +0100, Hernan Ponce de Leon wrote:
>> On 1/24/2023 5:04 PM, Waiman Long wrote:
>>>
>>> On 1/24/23 10:52, Peter Zijlstra wrote:
>>>> On Tue, Jan 24, 2023 at 10:42:24AM -0500, Waiman Long wrote:
>>>>>
>>>>> I would suggest to do it as suggested by PeterZ. Instead of set_bit(),
>>>>> however, it is probably better to use atomic_long_or() like
>>>>>
>>>>> atomic_long_or_relaxed(RT_MUTEX_HAS_WAITERS, (atomic_long_t
>>>>> *)&lock->owner)
>>>> That function doesn't exist, atomic_long_or() is implicitly relaxed for
>>>> not returning a value.
>>>
>>> You are right. atomic_long_or() doesn't have variants like some others.
>>>
>>> Cheers,
>>> Longman
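
As a side note for anyone following along (this is my summary of
Documentation/atomic_t.txt, not something from the discussion above):
atomic RMW ops that do not return a value are unordered by definition,
which is why only the value-returning fetch_or() form has ordering
variants. Roughly, with mask/p/old as placeholder names:

    atomic_long_or(mask, p);                      /* void: no ordering */
    old = atomic_long_fetch_or(mask, p);          /* fully ordered */
    old = atomic_long_fetch_or_relaxed(mask, p);  /* no ordering */
    old = atomic_long_fetch_or_acquire(mask, p);  /* ACQUIRE */
    old = atomic_long_fetch_or_release(mask, p);  /* RELEASE */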


When you say "replace the whole of that function", do you mean "barrier
included"? I argue in the other email that I think this should not affect
correctness (at least not obviously), but removing the barrier is doing more
than just fixing the data race as this patch suggests.

> Well, set_bit() implies smp_mb(), atomic_long_or() does not and would
> need to retain the barrier.
>
> That said, the comments are all batshit. The top comment states relaxed
> ordering is sufficient since holding the lock, the comment with the
> barrier utterly fails to explain what it's ordering against.

I think the top comment became obsolete after 1c0908d8e441 and this just
went unnoticed. I agree that the comment with the barrier does not say
much; getting more detailed information about it was one of the goals of
my other email.

> So all that would need to be updated as well.
>
> That said, looking at 1c0908d8e441 I'm not at all sure we need that
> barrier. Even in the try_to_take_rt_mutex(.waiter=NULL) case, where we
> skip over the task->pi_lock region, rt_mutex_set_owner(.acquire=true)
> will be ACQUIRE.

This states clearly the idea I was trying to express in my other email
about why the barrier is not necessary. I think the same argument holds
if we keep the barrier and relax the store in rt_mutex_set_owner() as
suggested by Boqun (see the patch below).
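
To make the pairing explicit, this is how I picture the two sides (a
sketch of the argument above only, not code from the patch):

    /* Waiter side, mark_rt_mutex_waiters(), under lock->wait_lock: */
    atomic_long_or(RT_MUTEX_HAS_WAITERS, (atomic_long_t *)&lock->owner);

    /* Acquirer side: even when try_to_take_rt_mutex(.waiter=NULL) skips
     * the task->pi_lock region, the owner store is an ACQUIRE, so the
     * new owner's critical section cannot move before it: */
    xchg_acquire(&lock->owner, rt_mutex_owner_encode(lock, owner));

    /* A failed trylock returns 0 and implies no ordering. */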


> And in case of rt_mutex_owner(), we fail the trylock (return with 0) and
> a failed trylock does not imply any ordering.
>
> So why are we having this barrier?

I ran the verification again with the following patch (I am aware that
the comments still need to be updated; this was just to be able to run
the tool) and the tool still finds no violation.

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 010cf4e6d0b8..c62e409906a2 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -107,7 +107,7 @@ rt_mutex_set_owner(struct rt_mutex_base *lock, struct task_struct *owner)
 	 * lock->wait_lock is held but explicit acquire semantics are needed
 	 * for a new lock owner so WRITE_ONCE is insufficient.
 	 */
-	xchg_acquire(&lock->owner, rt_mutex_owner_encode(lock, owner));
+	WRITE_ONCE(lock->owner, rt_mutex_owner_encode(lock, owner));
 }
 
 static __always_inline void rt_mutex_clear_owner(struct rt_mutex_base *lock)
@@ -232,12 +232,7 @@ static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
  */
 static __always_inline void mark_rt_mutex_waiters(struct rt_mutex_base *lock)
 {
-	unsigned long owner, *p = (unsigned long *) &lock->owner;
-
-	do {
-		owner = *p;
-	} while (cmpxchg_relaxed(p, owner,
-				 owner | RT_MUTEX_HAS_WAITERS) != owner);
+	atomic_long_or(RT_MUTEX_HAS_WAITERS, (atomic_long_t *)&lock->owner);
 
 	/*
 	 * The cmpxchg loop above is relaxed to avoid back-to-back ACQUIRE
--
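
For completeness, the variant Peter hints at above (keep the
xchg_acquire() in rt_mutex_set_owner() and drop the barrier from
mark_rt_mutex_waiters() instead) would look roughly like this. This is
only my sketch of that alternative, not something the tool has checked:

    static __always_inline void mark_rt_mutex_waiters(struct rt_mutex_base *lock)
    {
            /*
             * Serialized by lock->wait_lock; the ordering a new owner
             * needs comes from the ACQUIRE in rt_mutex_set_owner(), not
             * from a barrier here.
             */
            atomic_long_or(RT_MUTEX_HAS_WAITERS, (atomic_long_t *)&lock->owner);
    }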