Re: rtmutex, pi_blocked_on, and blk_flush_plug()

From: Thomas Gleixner
Date: Mon Feb 20 2023 - 13:21:59 EST


On Mon, Feb 20 2023 at 12:42, Sebastian Andrzej Siewior wrote:
> On 2023-02-20 12:04:56 [+0100], To Thomas Gleixner wrote:
>> The ->pi_blocked_on field is set by __rwbase_read_lock() before
>> schedule() is invoked while blocking on the sleeping lock. By doing this
>> we avoid __blk_flush_plug() and as such may deadlock, because we are
>> going to sleep after having made I/O progress earlier which is not globally
>> visible but might be (s/might be/is/ in the deadlock case) expected by
>> the owner of the lock.

Fair enough.
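
As a rough illustration of the problem described in the quoted text, here
is a userspace toy model. It is a sketch only, not the kernel code: the
toy_* names and structures are hypothetical stand-ins, and the only
behaviour taken from the description is that a task with ->pi_blocked_on
set goes to sleep without flushing its plug.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the plug and the task; not kernel types. */
struct toy_plug {
	int queued;		/* plugged requests not yet visible to anyone else */
};

struct toy_task {
	void *pi_blocked_on;	/* non-NULL while blocked on an rtmutex based lock */
	struct toy_plug *plug;
	int submitted;		/* requests that became globally visible */
};

/* Submit everything that is still sitting in the plug. */
static void toy_flush_plug(struct toy_task *t)
{
	if (t->plug && t->plug->queued) {
		t->submitted += t->plug->queued;
		t->plug->queued = 0;
	}
}

/*
 * Models the hook which runs before schedule(): the flush is skipped
 * when ->pi_blocked_on is already set, so the queued I/O stays private
 * while the task sleeps.
 */
static void toy_submit_work(struct toy_task *t)
{
	if (t->pi_blocked_on)
		return;
	toy_flush_plug(t);
}
```

With three requests plugged and ->pi_blocked_on set, toy_submit_work()
leaves all three queued. If the lock owner waits for exactly that I/O,
neither side makes progress.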

> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -1700,6 +1700,13 @@ static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
> 	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
> 		return 0;
>
> +	if (state != TASK_RTLOCK_WAIT) {
> +		/*
> +		 * If we are going to sleep and we have plugged IO queued,
> +		 * make sure to submit it to avoid deadlocks.
> +		 */
> +		blk_flush_plug(tsk->plug, true);

This still leaves the problem vs. io_wq_worker_sleeping() and its
running() counterpart after schedule().

Aside from that, for CONFIG_DEBUG_RT_MUTEXES=y builds it flushes on every
lock operation, whether the lock is contended or not.
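
A toy model of why that happens, assuming the debug build compiles the
cmpxchg fast path so that it always fails and every acquisition falls
through to the code after it. All toy_* names are hypothetical, and the
counter merely stands in for the blk_flush_plug() call in the hunk above.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_lock {
	void *owner;		/* NULL when the lock is free */
};

static int toy_flushes;		/* stands in for blk_flush_plug() calls */

/*
 * debug == true models CONFIG_DEBUG_RT_MUTEXES=y (assumption: the fast
 * path never succeeds there, even on a free lock).
 */
static bool toy_cmpxchg_acquire(struct toy_lock *l, void *me, bool debug)
{
	if (debug || l->owner)
		return false;
	l->owner = me;
	return true;
}

/* Returns 0 when the lock was taken, -1 when the caller would block. */
static int toy_rt_mutex_lock(struct toy_lock *l, void *me, bool debug)
{
	if (toy_cmpxchg_acquire(l, me, debug))
		return 0;

	/* The proposed flush sits here, ahead of the slow path. */
	toy_flushes++;

	/* Slow path: an uncontended lock is still acquired without sleeping. */
	if (!l->owner) {
		l->owner = me;
		return 0;
	}
	return -1;
}
```

In the non-debug case an uncontended toy_rt_mutex_lock() returns from the
fast path with no flush. In the debug case the same uncontended
acquisition bumps the flush counter although the task never sleeps.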

Grmbl.