Re: [PATCH] Fix data race in mark_rt_mutex_waiters

From: Waiman Long
Date: Thu Jan 26 2023 - 20:47:45 EST



On 1/26/23 17:10, David Laight wrote:
From: Hernan Ponce de Leon
Sent: 26 January 2023 21:07
...
 static __always_inline void rt_mutex_clear_owner(struct rt_mutex_base *lock)
@@ -232,12 +232,7 @@ static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
  */
 static __always_inline void mark_rt_mutex_waiters(struct rt_mutex_base *lock)
 {
-	unsigned long owner, *p = (unsigned long *) &lock->owner;
-
-	do {
-		owner = *p;
-	} while (cmpxchg_relaxed(p, owner,
-				 owner | RT_MUTEX_HAS_WAITERS) != owner);
+	atomic_long_or(RT_MUTEX_HAS_WAITERS, (atomic_long_t *)&lock->owner);
These *(int_type *)&foo accesses (quite often just plain wrong)
made me look up the definitions.

All one big accident waiting to happen...
RT_MUTEX_HAS_WAITERS is defined in a different header to the structure.
The explanatory comment is in a 3rd file.

It would all be safer if lock->owner were atomic_long_t with a comment
that it was the waiting task_struct | RT_MUTEX_HAS_WAITERS.
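
As a rough illustration of that suggestion, here is a minimal userspace sketch using C11 atomics in place of the kernel's atomic_long_t. The struct, the helper names, and DEMO_HAS_WAITERS are hypothetical stand-ins, not the kernel definitions; the point is only that an atomically typed field needs no pointer cast, and that a relaxed fetch-or does the same job as the open-coded compare-and-swap loop.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for RT_MUTEX_HAS_WAITERS (bit 0 of the owner word). */
#define DEMO_HAS_WAITERS ((uintptr_t)1)

struct demo_lock {
	/* Task pointer with DEMO_HAS_WAITERS folded into bit 0. */
	atomic_uintptr_t owner;
};

/* Single atomic read-modify-write; no cast because the field is atomic. */
static inline void demo_mark_waiters(struct demo_lock *lock)
{
	atomic_fetch_or_explicit(&lock->owner, DEMO_HAS_WAITERS,
				 memory_order_relaxed);
}

/* Equivalent compare-and-swap loop; the initial read is an atomic load. */
static inline void demo_mark_waiters_cas(struct demo_lock *lock)
{
	uintptr_t old = atomic_load_explicit(&lock->owner,
					     memory_order_relaxed);

	/* On failure, 'old' is refreshed with the current value. */
	while (!atomic_compare_exchange_weak_explicit(&lock->owner, &old,
						      old | DEMO_HAS_WAITERS,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;
}
```

Both helpers leave the pointer bits untouched and set only bit 0; the difference is that the second one spells out the retry loop that the first leaves to the hardware or the compiler.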

Given the actual definition, is rt_mutex_base_is_locked() even correct?

It is arguable whether the lock should be considered locked when a waiter is waiting but the lock itself is in an unlocked state at that moment. Mutex has a narrower definition of locked, while others have a broader one.
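
To make the two readings concrete, here is a minimal userspace sketch. It is not kernel code: the helper names and the single WAITERS_BIT flag are assumptions for illustration, and the owner word is passed in as a plain integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit stored in the low bit of the owner word. */
#define WAITERS_BIT ((uintptr_t)1)

/* Broad reading: any non-zero owner word counts as locked,
 * including "no owner, but the waiters bit is set". */
static inline bool is_locked_broad(uintptr_t owner_word)
{
	return owner_word != 0;
}

/* Narrow reading: mask the flag bit off first, so only a real
 * owner task counts as locked. */
static inline bool is_locked_narrow(uintptr_t owner_word)
{
	return (owner_word & ~WAITERS_BIT) != 0;
}
```

The two helpers disagree only for an owner word equal to WAITERS_BIT, which is exactly the case in question: a waiter is queued but nobody holds the lock.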

Cheers,
Longman