Re: [PATCH] numa,sched: only consider less busy nodes as numa balancing destination

From: Artem Bityutskiy
Date: Mon May 11 2015 - 07:11:24 EST


On Fri, 2015-05-08 at 16:03 -0400, Rik van Riel wrote:
> This works well when dealing with tasks that are constantly
> running, but fails catastrophically when dealing with tasks
> that go to sleep, wake back up, go back to sleep, wake back
> up, and generally mess up the load statistics that the NUMA
> balancing code use in a random way.

I believe sleeping happens a lot in this workload: the processes do a
lot of network I/O, some file I/O, and a lot of IPC.

Could you please expand on this a bit - why would this scenario "mess
up the load statistics"?

> If the normal scheduler load balancer is moving tasks the
> other way the NUMA balancer is moving them, things will
> not converge, and tasks will have worse memory locality
> than not doing NUMA balancing at all.

Are the regular and NUMA balancers independent?

Are there mechanisms to detect ping-pong situations? I'd like to verify
your theory, and these kinds of mechanisms would be helpful.

> Currently the load balancer has a preference for moving
> tasks to their preferred nodes (NUMA_FAVOUR_HIGHER, true),
> but there is no resistance to moving tasks away from their
> preferred nodes (NUMA_RESIST_LOWER, false). That setting
> was arrived at after a fair amount of experimenting, and
> is probably correct.

I guess I can try setting NUMA_RESIST_LOWER to true and see what
happens. But first I should probably confirm that your theory (the two
balancers playing ping-pong) is correct. Any hints on how I would do
this?
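
For the NUMA_RESIST_LOWER experiment, here is a minimal sketch of how
the flag could be checked without rebuilding the kernel. It assumes the
kernel is built with CONFIG_SCHED_DEBUG and that debugfs is mounted at
/sys/kernel/debug, so that the sched_features file is available; the
helper names feature_enabled() and toggle_command() are made up for
illustration. In that file, disabled features are listed with a "NO_"
prefix.

```python
# Sketch only: inspect scheduler feature flags via debugfs.
# Assumptions: CONFIG_SCHED_DEBUG=y, debugfs mounted at /sys/kernel/debug,
# and root privileges to read the file.
from pathlib import Path

SCHED_FEATURES = Path("/sys/kernel/debug/sched_features")  # assumed location


def feature_enabled(features_text, name):
    """Return True if `name` is listed as enabled, False if it is listed
    as NO_<name>, and None if the feature is not listed at all."""
    for token in features_text.split():
        if token == name:
            return True
        if token == "NO_" + name:
            return False
    return None


def toggle_command(name, enable):
    """Build the shell command that would flip the feature (hypothetical
    helper; it only formats the string, it does not run anything)."""
    token = name if enable else "NO_" + name
    return f"echo {token} > {SCHED_FEATURES}"


if __name__ == "__main__":
    try:
        text = SCHED_FEATURES.read_text()
        print("NUMA_RESIST_LOWER:", feature_enabled(text, "NUMA_RESIST_LOWER"))
    except OSError as err:
        # Typically: not root, debugfs not mounted, or no CONFIG_SCHED_DEBUG.
        print("cannot read sched_features:", err)
    print(toggle_command("NUMA_RESIST_LOWER", True))
```

This only shows and flips the setting; it says nothing about whether
the two balancers are actually fighting each other.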

Thanks!

Artem.

