[PATCH v4 0/6] sched: use runnable load based balance

From: Alex Shi
Date: Sat Apr 27 2013 - 01:27:24 EST


This patchset is based on tip/sched/core.

This version removes the burst wakeup detection, which had worked fine on the
3.8 kernel because aim7 was very imbalanced there. With rwsem write lock
stealing enabled in the 3.9 kernel, the aim7 imbalance disappeared, so the
burst wakeup handling is no longer needed.

It was tested on Intel Core2, NHM, SNB, and IVB machines, with 2 and 4 sockets,
using the kbuild, aim7, dbench, tbench, hackbench, and fileio-cfq (sysbench)
benchmarks.

On a 4-socket SNB EP machine, hackbench improved by about 50% and the results
became stable. On the other machines, hackbench improved by about 2~5%.
There was no clear performance change on the other benchmarks.

Michael Wang also tested pgbench on his box:
https://lkml.org/lkml/2013/4/2/1022
---
Done, here are the pgbench results without the last patch on my box:

| db_size | clients |  tps  | |  tps  |
+---------+---------+-------+ +-------+
| 22 MB   |       1 | 10662 | | 10679 |
| 22 MB   |       2 | 21483 | | 21471 |
| 22 MB   |       4 | 42046 | | 41957 |
| 22 MB   |       8 | 55807 | | 55684 |
| 22 MB   |      12 | 50768 | | 52074 |
| 22 MB   |      16 | 49880 | | 52879 |
| 22 MB   |      24 | 45904 | | 53406 |
| 22 MB   |      32 | 43420 | | 54088 | +24.57%
| 7484 MB |       1 |  7965 | |  7725 |
| 7484 MB |       2 | 19354 | | 19405 |
| 7484 MB |       4 | 37552 | | 37246 |
| 7484 MB |       8 | 48655 | | 50613 |
| 7484 MB |      12 | 45778 | | 47639 |
| 7484 MB |      16 | 45659 | | 48707 |
| 7484 MB |      24 | 42192 | | 46469 |
| 7484 MB |      32 | 36385 | | 46346 | +27.38%
| 15 GB   |       1 |  7677 | |  7727 |
| 15 GB   |       2 | 19227 | | 19199 |
| 15 GB   |       4 | 37335 | | 37372 |
| 15 GB   |       8 | 48130 | | 50333 |
| 15 GB   |      12 | 45393 | | 47590 |
| 15 GB   |      16 | 45110 | | 48091 |
| 15 GB   |      24 | 41415 | | 47415 |
| 15 GB   |      32 | 35988 | | 45749 | +27.12%
---

It was also tested by morten.rasmussen@xxxxxxx:
http://comments.gmane.org/gmane.linux.kernel/1463371
---
The patches are based on 3.9-rc2 and have been tested on an ARM vexpress TC2
big.LITTLE testchip containing five cpus: 2xCortex-A15 + 3xCortex-A7.
Additional testing and refinements might be needed later as more sophisticated
platforms become available.

cpu_power A15: 1441
cpu_power A7: 606

Benchmarks:
cyclictest: cyclictest -a -t 2 -n -D 10
hackbench: hackbench (default settings)
sysbench_1t: sysbench --test=cpu --num-threads=1 --max-requests=1000 run
sysbench_2t: sysbench --test=cpu --num-threads=2 --max-requests=1000 run
sysbench_5t: sysbench --test=cpu --num-threads=5 --max-requests=1000 run


Mixed cpu_power:
Average times over 20 runs normalized to 3.9-rc2 (lower is better):
            3.9-rc2   +shi    +shi+patches   Improvement
cyclictest
    AVG      74.9      74.5      75.75         -1.13%
    MIN      69        69        69
    MAX      88        88        94
hackbench
    AVG       2.17      2.09      2.09          3.90%
    MIN       2.10      1.95      2.02
    MAX       2.25      2.48      2.17
sysbench_1t
    AVG      25.13*    16.47'    16.48         34.43%
    MIN      16.47     16.47     16.47
    MAX      33.78     16.48     16.54
sysbench_2t
    AVG      19.32     18.19     16.51         14.55%
    MIN      16.48     16.47     16.47
    MAX      22.15     22.19     16.61
sysbench_5t
    AVG      27.22     27.71     24.14         11.31%
    MIN      25.42     27.66     24.04
    MAX      27.75     27.86     24.31

* The unpatched 3.9-rc2 scheduler gives inconsistent performance, as tasks may
randomly be placed on either A7 or A15 cores. The max/min values reflect this
behaviour. A15 and A7 times are ~16.5 and ~33.5 respectively.

' While Alex Shi's patches appear to solve the performance inconsistency for
sysbench_1t, this does not hold for all workloads, as can be seen for
sysbench_2t.

To ensure that the proposed changes do not affect normal SMP systems, the
same benchmarks have been run on a 2xCortex-A15 configuration as well:

SMP:
Average times over 20 runs normalized to 3.9-rc2 (lower is better):
            3.9-rc2   +shi    +shi+patches   Improvement
cyclictest
    AVG      78.6      75.3      77.6           1.34%
    MIN      69        69        69
    MAX     135        98       125
hackbench
    AVG       3.55      3.54      3.55          0.06%
    MIN       3.51      3.48      3.49
    MAX       3.66      3.65      3.67
sysbench_1t
    AVG      16.48     16.48     16.48         -0.03%
    MIN      16.47     16.48     16.48
    MAX      16.49     16.48     16.48
sysbench_2t
    AVG      16.53     16.53     16.54         -0.05%
    MIN      16.47     16.47     16.48
    MAX      16.59     16.57     16.59
sysbench_5t
    AVG      41.16     41.15     41.15          0.04%
    MIN      41.14     41.13     41.11
    MAX      41.35     41.19     41.17
---

Peter,
Would you consider picking up the patchset, or give some comments? :)

Best regards
Alex

[PATCH v4 1/6] Revert "sched: Introduce temporary FAIR_GROUP_SCHED
[PATCH v4 2/6] sched: set initial value of runnable avg for new
[PATCH v4 3/6] sched: update cpu load after task_tick.
[PATCH v4 4/6] sched: compute runnable load avg in cpu_load and
[PATCH v4 5/6] sched: consider runnable load average in move_tasks
[PATCH v4 6/6] sched: consider runnable load average in