[RFC 0/3] sched/topology: fix sched groups on NUMA machines with mesh topology

From: Lauro Ramos Venancio
Date: Thu Apr 13 2017 - 09:56:48 EST


Currently, the scheduler is not able to directly move tasks between some NUMA
nodes 2 hops apart on machines with mesh topology. This occurs because some
NUMA nodes belong to all sched groups. For more details, see the commit log
of patch 2.

This bug was reported in the paper [1] as "The Scheduling Group Construction
bug".

This patchset constructs the sched groups from each CPU's perspective, so each
NUMA node can have different groups in the last NUMA sched domain level.
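
As a rough illustration of the idea (a toy model, not the kernel code), the
Python sketch below builds groups on a hypothetical 4-node ring with one CPU
per node. That ring is an assumption chosen for brevity; it is not the 8-node
mesh machine used for the measurements. All function names are made up for
this example. A node's 2-hop peer is treated as "hidden" when it sits in the
node's own local group, which is a simplification of why the load balancer
cannot move tasks directly between them.

```python
# Toy model only: hypothetical 4-node ring (0-1-2-3-0), one CPU per node.
N = 4

def one_hop_span(node):
    """Nodes reachable from `node` in at most one hop on the ring."""
    return frozenset({(node - 1) % N, node, (node + 1) % N})

def build_groups(start):
    """Cover all nodes with one-hop spans, scanning nodes from `start`."""
    covered = set()
    groups = []
    for k in range(N):
        node = (start + k) % N
        if node in covered:
            continue  # already a member of an earlier group
        span = one_hop_span(node)
        groups.append(span)
        covered |= span
    return groups

def local_group(node, groups):
    """First group that contains `node` (assumed to be its local group)."""
    return next(g for g in groups if node in g)

def hidden_peers(groups_for):
    """Nodes whose 2-hop peer sits inside their own local group."""
    hidden = []
    for node in range(N):
        peer = (node + 2) % N
        if peer in local_group(node, groups_for(node)):
            hidden.append(node)
    return hidden

# Single perspective: every node reuses the groups built from node 0.
shared_view = hidden_peers(lambda node: build_groups(0))

# Per-node perspective: each node builds its groups starting from itself.
per_node_view = hidden_peers(build_groups)

print("groups from node 0:", [sorted(g) for g in build_groups(0)])
print("groups from node 1:", [sorted(g) for g in build_groups(1)])
print("hidden with shared groups:  ", shared_view)
print("hidden with per-node groups:", per_node_view)
```

In this toy model, sharing the groups built from node 0 leaves nodes 1 and 3
in every group, so each hides the other. Building the groups from each node's
own starting point gives nodes 1 and 3 a different pair of groups and no node
hides its 2-hop peer.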

SPECjbb2005 results show a performance improvement of up to 63% and a large
drop in standard deviation on a machine with 8 NUMA nodes and mesh topology.

Patch 1 - only prepares the code for patch 2
Patch 2 - changes the sched groups construction
Patch 3 - fixes an issue with different groups starting with the same CPU

[1] http://www.ece.ubc.ca/~sasha/papers/eurosys16-final29.pdf

Regards,
Lauro

Lauro Ramos Venancio (3):
  sched/topology: Refactor function build_overlap_sched_groups()
  sched/topology: fix sched groups on NUMA machines with mesh topology
  sched/topology: Different sched groups must not have the same balance
    cpu

kernel/sched/topology.c | 165 ++++++++++++++++++++++++++++++++++--------------
1 file changed, 117 insertions(+), 48 deletions(-)

--
1.8.3.1