Re: [PATCH v2] sched/core: Always flush pending blk_plug

From: Peter Zijlstra
Date: Fri Jul 08 2022 - 09:54:15 EST


On Fri, Jul 08, 2022 at 10:32:12AM +0100, John Keeping wrote:

> It seems that the intent here is to skip blk_flush_plug() in the case
> where a non-preemptible lock (such as a spinlock) has been converted to
> a rtmutex on RT, which is the case covered by the SM_RTLOCK_WAIT
> schedule flag. But sched_submit_work() is only called from schedule()
> which is never called in this scenario, so the check can simply be
> deleted.

> include/linux/sched/rt.h | 8 --------
> kernel/sched/core.c | 3 ---
> 2 files changed, 11 deletions(-)
>
> diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
> index e5af028c08b49..994c25640e156 100644
> --- a/include/linux/sched/rt.h
> +++ b/include/linux/sched/rt.h
> @@ -39,20 +39,12 @@ static inline struct task_struct *rt_mutex_get_top_task(struct task_struct *p)
>  }
>  extern void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task);
>  extern void rt_mutex_adjust_pi(struct task_struct *p);
> -static inline bool tsk_is_pi_blocked(struct task_struct *tsk)
> -{
> -	return tsk->pi_blocked_on != NULL;
> -}
>  #else
>  static inline struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
>  {
>  	return NULL;
>  }
>  # define rt_mutex_adjust_pi(p) do { } while (0)
> -static inline bool tsk_is_pi_blocked(struct task_struct *tsk)
> -{
> -	return false;
> -}
>  #endif
>
>  extern void normalize_rt_tasks(void);

Excellent, glad to see the back of that.

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 1d4660a1915b3..e4974fe003b5b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6578,9 +6578,6 @@ static inline void sched_submit_work(struct task_struct *tsk)
>  			io_wq_worker_sleeping(tsk);
>  	}
>
> -	if (tsk_is_pi_blocked(tsk))
> -		return;
> -

Would it make sense to replace this with:

	SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);

Along with a comment along the lines of:

> - spinlock and rwlock must not flush block requests. This will deadlock
> if the callback attempts to acquire a lock which is already acquired.
> Similarly to being preempted, there should be no warning if the
> scheduling point is within a RCU read section.
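
To make the suggestion concrete, here is a rough userspace sketch of how the warning and comment could sit in front of the flush. It is only a model, not kernel code: every mock_* name, the MOCK_TASK_RTLOCK_WAIT value and the has_plug field are hypothetical stand-ins for tsk->__state, TASK_RTLOCK_WAIT, SCHED_WARN_ON() and the pending blk_plug. The point it illustrates is that the check only reports an unexpected state and the flush still always happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for TASK_RTLOCK_WAIT; the value is illustrative only. */
#define MOCK_TASK_RTLOCK_WAIT 0x1000u

/* Hypothetical model of the bits of task state this sketch cares about. */
struct mock_task {
	unsigned int state;	/* models tsk->__state */
	bool has_plug;		/* models a pending blk_plug */
	unsigned int warnings;	/* how often the warning fired */
	unsigned int flushes;	/* how often the plug was flushed */
};

/* Models SCHED_WARN_ON(): report the condition, then carry on. */
static bool mock_sched_warn_on(struct mock_task *tsk, bool cond, const char *what)
{
	if (cond) {
		tsk->warnings++;
		fprintf(stderr, "WARN: %s\n", what);
	}
	return cond;
}

/* Models sched_submit_work() with the early return replaced by a warning. */
static void mock_sched_submit_work(struct mock_task *tsk)
{
	/*
	 * spinlock and rwlock must not flush block requests. This will
	 * deadlock if the callback attempts to acquire a lock which is
	 * already acquired. This path is only reached from schedule(),
	 * so a task in the RT lock wait state should never show up here.
	 */
	mock_sched_warn_on(tsk, (tsk->state & MOCK_TASK_RTLOCK_WAIT) != 0,
			   "RT lock waiter in sched_submit_work()");

	/* The flush is no longer skipped for PI-blocked tasks. */
	if (tsk->has_plug) {
		tsk->flushes++;
		tsk->has_plug = false;
	}
}

/* Usage: a normal sleeper flushes without tripping the warning. */
static void mock_usage_example(void)
{
	struct mock_task t = { .state = 0, .has_plug = true };

	mock_sched_submit_work(&t);
	assert(t.warnings == 0 && t.flushes == 1);
}
```

In the real function the check would be the single SCHED_WARN_ON() line above, placed where the tsk_is_pi_blocked() test used to be.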