Re: [PATCH 29/52] sched: Implement NUMA scanning backoff

From: Rik van Riel
Date: Mon Dec 03 2012 - 14:58:00 EST


On 12/02/2012 01:43 PM, Ingo Molnar wrote:
> Back off slowly from scanning, up to sysctl_sched_numa_scan_period_max
> (1.6 seconds). Scan faster again if we were forced to switch to
> another node.
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 8f0e6ba..59fea2e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -865,8 +865,10 @@ static void task_numa_placement(struct task_struct *p)
>  		}
>  	}
>  
> -	if (max_node != p->numa_max_node)
> +	if (max_node != p->numa_max_node) {
>  		sched_setnuma(p, max_node, task_numa_shared(p));
> +		goto out_backoff;
> +	}
>  
>  	p->numa_migrate_seq++;
>  	if (sched_feat(NUMA_SETTLE) &&
Is that correct?

It looks like the code only jumps to the out_backoff label
after resetting p->numa_scan_period to sysctl_sched_numa_scan_period_min
in sched_setnuma?

Should it not be the other way around, slowly increasing the process's
numa_scan_period when we do NOT do a sched_setnuma call for the process
at all?
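
To make the question concrete, here is a rough userspace sketch of the
policy as the changelog words it. It is not kernel code: the function
name next_scan_period() and the 100 ms minimum are made up for
illustration, and only the 1.6 second maximum comes from the changelog.

```c
/* Userspace sketch only; not kernel code. Periods are in milliseconds. */
#define SCAN_PERIOD_MIN_MS 100u   /* hypothetical value, for illustration */
#define SCAN_PERIOD_MAX_MS 1600u  /* the 1.6 seconds named in the changelog */

/*
 * Return the next scan period under the policy the changelog describes:
 * scan fast again after a node switch, otherwise back off by doubling,
 * capped at the maximum.
 */
static unsigned int next_scan_period(unsigned int period, int switched_node)
{
	unsigned int doubled;

	if (switched_node)
		return SCAN_PERIOD_MIN_MS;

	doubled = period * 2;
	return doubled < SCAN_PERIOD_MAX_MS ? doubled : SCAN_PERIOD_MAX_MS;
}
```

With this shape, a task that stays on its node sees its period grow
towards the maximum, and only a node switch brings it back down.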

> @@ -882,7 +884,11 @@ static void task_numa_placement(struct task_struct *p)
>  	if (shared != task_numa_shared(p)) {
>  		sched_setnuma(p, p->numa_max_node, shared);
>  		p->numa_migrate_seq = 0;
> +		goto out_backoff;
>  	}
> +	return;

We can never reach the backoff code, except by an explicit goto,
which is only there after a call to sched_setnuma.

That is the opposite of what the changelog suggests...
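
A small userspace model of the control flow as I read it, in case it
helps. It is a sketch under assumptions: struct model_task,
model_setnuma() and model_placement() are invented names, the 100 ms
minimum is hypothetical, and the model assumes sched_setnuma() resets
the scan period to the minimum. Only the first of the two goto sites is
modelled.

```c
/* Userspace model of the posted control flow; not kernel code. */
#define MODEL_PERIOD_MIN_MS 100u   /* hypothetical value, for illustration */
#define MODEL_PERIOD_MAX_MS 1600u  /* 1.6 seconds, from the changelog */

struct model_task {
	int numa_max_node;
	unsigned int numa_scan_period; /* milliseconds */
};

/* Assumed behaviour of sched_setnuma(): record the node, reset the period. */
static void model_setnuma(struct model_task *p, int node)
{
	p->numa_max_node = node;
	p->numa_scan_period = MODEL_PERIOD_MIN_MS;
}

/* Mirrors the shape of the patched task_numa_placement(). */
static void model_placement(struct model_task *p, int max_node)
{
	unsigned int doubled;

	if (max_node != p->numa_max_node) {
		model_setnuma(p, max_node);
		goto out_backoff;
	}
	return; /* task stayed put: the period is never touched */

out_backoff:
	doubled = p->numa_scan_period * 2;
	p->numa_scan_period = doubled < MODEL_PERIOD_MAX_MS ?
			      doubled : MODEL_PERIOD_MAX_MS;
}
```

In this model a task that stays on its node keeps whatever period it
had, while a task that switches nodes ends up at twice the minimum, so
the doubling only ever applies right after a reset.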

> +out_backoff:
> +	p->numa_scan_period = min(p->numa_scan_period * 2, sysctl_sched_numa_scan_period_max);
>  }
>  
>  /*

