Re: [PATCH] sched: Fix rq nr_uninterruptible count

From: Nikolay Borisov
Date: Tue Feb 28 2023 - 04:38:21 EST




On 28.02.23 10:46, zhenggy wrote:
When an uninterruptible task is queued on a different CPU from the one
where it was dequeued, the rq nr_uninterruptible count will be incorrect,
so fix it.

Signed-off-by: GuoYong Zheng <zhenggy@xxxxxxxxxxxxxxx>


 * - cpu_rq()->nr_uninterruptible isn't accurately tracked per-CPU because
 *   this would add another cross-CPU cacheline miss and atomic operation
 *   to the wakeup path. Instead we increment on whatever CPU the task ran
 *   when it went into uninterruptible state and decrement on whatever CPU
 *   did the wakeup. This means that only the sum of nr_uninterruptible over
 *   all CPUs yields the correct result.

---
kernel/sched/core.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 25b582b..cd5ef6e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4068,6 +4068,7 @@ bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
 {
 	unsigned long flags;
 	int cpu, success = 0;
+	struct rq *src_rq, *dst_rq;

 	preempt_disable();
 	if (p == current) {
@@ -4205,6 +4206,16 @@ bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
 			atomic_dec(&task_rq(p)->nr_iowait);
 		}

+		if (p->sched_contributes_to_load) {
+			src_rq = cpu_rq(task_cpu(p));
+			dst_rq = cpu_rq(cpu);
+
+			double_rq_lock(src_rq, dst_rq);
+			src_rq->nr_uninterruptible--;
+			dst_rq->nr_uninterruptible++;
+			double_rq_unlock(src_rq, dst_rq);
+		}
+
 		wake_flags |= WF_MIGRATED;
 		psi_ttwu_dequeue(p);
 		set_task_cpu(p, cpu);