RE: [PATCH] rcu: Only boost rcu reader tasks with lower priority than boost kthreads

From: Zhang, Qiang1
Date: Sun Mar 06 2022 - 21:03:41 EST


On 3/4/2022 2:56 PM, Zqiang wrote:
> When RCU_BOOST is enabled, the boost kthreads boost readers
> that are blocking a given grace period. If the current reader task
> has a higher priority than the boost kthreads (the boost kthreads'
> priority is not always 1 if kthread_prio is set), boosting is useless,
> so skip the current task and select the next task to boost, reducing
> the time for a given grace period.
>
> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
> ---
> kernel/rcu/tree_plugin.h | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index c3d212bc5338..d35b6da66bbd 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -12,6 +12,7 @@
> */
>
> #include "../locking/rtmutex_common.h"
> +#include <linux/sched/deadline.h>
>
> static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
> {
> @@ -1065,13 +1066,20 @@ static int rcu_boost(struct rcu_node *rnp)
> * section.
> */
> t = container_of(tb, struct task_struct, rcu_node_entry);
> + if (!rnp->exp_tasks && (dl_task(t) || t->prio <= current->prio)) {
> + tb = rcu_next_node_entry(t, rnp);
> + WRITE_ONCE(rnp->boost_tasks, tb);
> + raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> + goto end;
> + }
> +
> rt_mutex_init_proxy_locked(&rnp->boost_mtx.rtmutex, t);
> raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> /* Lock only for side effect: boosts task t's priority. */
> rt_mutex_lock(&rnp->boost_mtx);
> rt_mutex_unlock(&rnp->boost_mtx); /* Then keep lockdep happy. */
> rnp->n_boosts++;
> -
> +end:
>>
>>Nit: maybe rename the label to "skip_boost:" ?
>>
>>Code looks fine; however, out of curiosity; given that the higher
>>priority tasks, in general, would exit their read side critical section
>>quickly and boost the next blocking reader on exiting their read side
>>section; do you see noticeable reduction in grace period timings with
>>the change for certain type of workloads?

Thanks for the feedback. In PREEMPT_RT systems there are usually many real-time threads
(most of them created by users), and their priority may be higher or lower than that of the
boost kthreads (when kthread_prio is set). An RT task with a higher priority than the boost
kthreads may exit its read-side critical section quickly, but it may not if it is preempted
by an even higher-priority task. Trying to boost such a task only makes the boost kthread
wait longer, so the next blocked tasks cannot be boosted in time. Of course, I don't deny
that inappropriate priority settings by the user can also be a cause.
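
For illustration only, the skip condition from the patch can be modeled in user space
roughly as below. This is a sketch, not kernel code: struct demo_task, its fields and
should_skip_boost() are hypothetical names, is_deadline stands in for dl_task(t), and
exp_tasks_pending stands in for rnp->exp_tasks being non-NULL. It assumes the kernel
convention that a lower prio value means a higher priority.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few task fields the check looks at. */
struct demo_task {
	int prio;         /* kernel-style: lower value = higher priority */
	bool is_deadline; /* stands in for dl_task(t) */
};

/*
 * Returns true when the boost kthread should skip this reader and move on
 * to the next blocked task, mirroring the condition added to rcu_boost():
 *   !rnp->exp_tasks && (dl_task(t) || t->prio <= current->prio)
 */
static bool should_skip_boost(const struct demo_task *reader,
			      const struct demo_task *booster,
			      bool exp_tasks_pending)
{
	assert(reader && booster);

	/* Readers blocking an expedited grace period are never skipped. */
	if (exp_tasks_pending)
		return false;

	/* Deadline readers and readers at or above the booster's priority
	 * would not gain anything from being boosted. */
	return reader->is_deadline || reader->prio <= booster->prio;
}
```

With a boost kthread at prio 98, a reader at prio 49 would be skipped, while a reader
at prio 120 would still be boosted, as would any reader blocking an expedited grace period.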

Thanks
Zqiang

>>
>>
>>Thanks
>>Neeraj

> return READ_ONCE(rnp->exp_tasks) != NULL ||
> READ_ONCE(rnp->boost_tasks) != NULL;
> }