Re: [PATCH 00/10] sched: EEVDF using latency-nice

From: K Prateek Nayak
Date: Wed Mar 22 2023 - 02:49:40 EST


Hello Peter,

Leaving some results from my testing on a dual socket Zen3 machine
(2 x 64C/128T) below.

tl;dr

o I've not tested workloads with nice and latency-nice values yet,
  focusing instead on out-of-the-box performance. For the same reason,
  no sched_feat changes were made.

o Except for hackbench (m:n communication relationship), I do not see
  any regression in the other standard benchmarks (mostly 1:1 or 1:n
  relationships) when the system is below fully loaded.

o In the fully loaded scenario, schbench seems to be unhappy. Looking at
  the data from /proc/<pid>/sched for the tasks with schedstats enabled,
  there is an increase in the number of context switches and in the total
  wait sum. When the system is overloaded, things flip and the schbench
  tail latency improves drastically. I suspect the involuntary context
  switches help workers make progress much sooner after wakeup compared
  to tip, leading to lower tail latency.

o For the same reason as above, tbench throughput takes a hit, with the
  number of involuntary context switches increasing drastically for the
  tbench server. An increase in wait sum is also noticeable.

o A couple of real-world workloads were also tested. DeathStarBench
  throughput tanks much more with the updated version in your tree
  compared to this series as is.
  SpecJBB Max-jOPS sees large improvements, but at the cost of a drop
  in Critical-jOPS, signifying an increase in either wait time or
  involuntary context switches, which can lead to transactions taking
  longer to complete.

o Apart from DeathStarBench, all the reported trends remain the same
  when comparing the version in your tree and this series, as is,
  applied on the same base kernel.

I'll leave the detailed results and some limited analysis below.

On 3/6/2023 6:55 PM, Peter Zijlstra wrote:
> Hi!
>
> Ever since looking at the latency-nice patches, I've wondered if EEVDF would
> not make more sense, and I did point Vincent at some older patches I had for
> that (which is where his augmented rbtree thing comes from).
>
> Also, since I really dislike the dual tree, I also figured we could dynamically
> switch between an augmented tree and not (and while I have code for that,
> that's not included in this posting because with the current results I don't
> think we actually need this).
>
> Anyway, since I'm somewhat under the weather, I spent last week desperately
> trying to connect a small cluster of neurons in defiance of the snot overlord
> and bring back the EEVDF patches from the dark crypts where they'd been
> gathering cobwebs for the past 13 odd years.
>
> By friday they worked well enough, and this morning (because obviously I forgot
> the weekend is ideal to run benchmarks) I ran a bunch of hackbench, netperf,
> tbench and sysbench -- there's a bunch of wins and losses, but nothing that
> indicates a total fail.
>
> ( in fact, some of the schbench results seem to indicate EEVDF schedules a lot
> more consistently than CFS and has a bunch of latency wins )
>
> ( hackbench also doesn't show the augmented tree and generally more expensive
> pick to be a loss, in fact it shows a slight win here )
>
>
> hackbench load + cyclictest --policy other results:
>
>
> EEVDF CFS
>
> # Min Latencies: 00053
> LNICE(19) # Avg Latencies: 04350
> # Max Latencies: 76019
>
> # Min Latencies: 00052 00053
> LNICE(0) # Avg Latencies: 00690 00687
> # Max Latencies: 14145 13913
>
> # Min Latencies: 00019
> LNICE(-19) # Avg Latencies: 00261
> # Max Latencies: 05642
>

Following are the results from testing the series on a dual socket
Zen3 machine (2 x 64C/128T):

NPS modes are used to logically divide a single socket into
multiple NUMA regions.
Following is the NUMA configuration for each NPS mode on the system:

NPS1: Each socket is a NUMA node.
Total 2 NUMA nodes in the dual socket machine.

Node 0: 0-63, 128-191
Node 1: 64-127, 192-255

NPS2: Each socket is further logically divided into 2 NUMA regions.
Total 4 NUMA nodes exist over the 2 sockets.

Node 0: 0-31, 128-159
Node 1: 32-63, 160-191
Node 2: 64-95, 192-223
Node 3: 96-127, 224-255

NPS4: Each socket is logically divided into 4 NUMA regions.
Total 8 NUMA nodes exist over the 2 sockets.

Node 0: 0-15, 128-143
Node 1: 16-31, 144-159
Node 2: 32-47, 160-175
Node 3: 48-63, 176-191
Node 4: 64-79, 192-207
Node 5: 80-95, 208-223
Node 6: 96-111, 224-239
Node 7: 112-127, 240-255
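
(For reference, the per-node CPU lists above can be cross-checked against
sysfs; below is a minimal Python sketch assuming the standard
/sys/devices/system/node layout.)

#!/usr/bin/env python3
# Minimal sketch: print the CPU list of every NUMA node as exposed by
# sysfs, to cross-check the NPS1/NPS2/NPS4 layouts listed above.
import glob
import os

nodes = glob.glob("/sys/devices/system/node/node[0-9]*")
for node in sorted(nodes, key=lambda p: int(p.rsplit("node", 1)[1])):
    with open(os.path.join(node, "cpulist")) as f:
        print(f"{os.path.basename(node)}: {f.read().strip()}")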

Kernel versions:
- tip: 6.2.0-rc6 tip sched/core
- eevdf: 6.2.0-rc6 tip sched/core
+ eevdf commits from your tree
(https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=sched/eevdf)

- eevdf prev: 6.2.0-rc6 tip sched/core + this series as is

When the testing started, the tip was at:
commit 7c4a5b89a0b5 "sched/rt: pick_next_rt_entity(): check list_entry"

Benchmark Results:

~~~~~~~~~~~~~
~ hackbench ~
~~~~~~~~~~~~~

o NPS1

Test: tip eevdf
1-groups: 4.63 (0.00 pct) 4.52 (2.37 pct)
2-groups: 4.42 (0.00 pct) 5.41 (-22.39 pct) *
4-groups: 4.21 (0.00 pct) 5.26 (-24.94 pct) *
8-groups: 4.95 (0.00 pct) 5.01 (-1.21 pct)
16-groups: 5.43 (0.00 pct) 6.24 (-14.91 pct) *

o NPS2

Test: tip eevdf
1-groups: 4.68 (0.00 pct) 4.56 (2.56 pct)
2-groups: 4.45 (0.00 pct) 5.19 (-16.62 pct) *
4-groups: 4.19 (0.00 pct) 4.53 (-8.11 pct) *
8-groups: 4.80 (0.00 pct) 4.81 (-0.20 pct)
16-groups: 5.60 (0.00 pct) 6.22 (-11.07 pct) *

o NPS4

Test: tip eevdf
1-groups: 4.68 (0.00 pct) 4.57 (2.35 pct)
2-groups: 4.56 (0.00 pct) 5.19 (-13.81 pct) *
4-groups: 4.50 (0.00 pct) 4.96 (-10.22 pct) *
8-groups: 5.76 (0.00 pct) 5.49 (4.68 pct)
16-groups: 5.60 (0.00 pct) 6.53 (-16.60 pct) *

~~~~~~~~~~~~
~ schbench ~
~~~~~~~~~~~~

o NPS1

#workers: tip eevdf
1: 36.00 (0.00 pct) 36.00 (0.00 pct)
2: 37.00 (0.00 pct) 37.00 (0.00 pct)
4: 38.00 (0.00 pct) 39.00 (-2.63 pct)
8: 52.00 (0.00 pct) 50.00 (3.84 pct)
16: 66.00 (0.00 pct) 68.00 (-3.03 pct)
32: 111.00 (0.00 pct) 109.00 (1.80 pct)
64: 213.00 (0.00 pct) 212.00 (0.46 pct)
128: 502.00 (0.00 pct) 637.00 (-26.89 pct) *
256: 45632.00 (0.00 pct) 24992.00 (45.23 pct) ^
512: 78720.00 (0.00 pct) 44096.00 (43.98 pct) ^

o NPS2

#workers: tip eevdf
1: 31.00 (0.00 pct) 23.00 (25.80 pct)
2: 32.00 (0.00 pct) 33.00 (-3.12 pct)
4: 39.00 (0.00 pct) 37.00 (5.12 pct)
8: 52.00 (0.00 pct) 49.00 (5.76 pct)
16: 67.00 (0.00 pct) 68.00 (-1.49 pct)
32: 113.00 (0.00 pct) 112.00 (0.88 pct)
64: 213.00 (0.00 pct) 214.00 (-0.46 pct)
128: 508.00 (0.00 pct) 491.00 (3.34 pct)
256: 46912.00 (0.00 pct) 22304.00 (52.45 pct) ^
512: 76672.00 (0.00 pct) 42944.00 (43.98 pct) ^

o NPS4

#workers: tip eevdf
1: 33.00 (0.00 pct) 30.00 (9.09 pct)
2: 40.00 (0.00 pct) 36.00 (10.00 pct)
4: 44.00 (0.00 pct) 41.00 (6.81 pct)
8: 73.00 (0.00 pct) 73.00 (0.00 pct)
16: 71.00 (0.00 pct) 71.00 (0.00 pct)
32: 111.00 (0.00 pct) 115.00 (-3.60 pct)
64: 217.00 (0.00 pct) 211.00 (2.76 pct)
128: 509.00 (0.00 pct) 553.00 (-8.64 pct) *
256: 44352.00 (0.00 pct) 26848.00 (39.46 pct) ^
512: 75392.00 (0.00 pct) 44352.00 (41.17 pct) ^


~~~~~~~~~~
~ tbench ~
~~~~~~~~~~

o NPS1

Clients: tip eevdf
1 483.10 (0.00 pct) 476.46 (-1.37 pct)
2 956.03 (0.00 pct) 943.12 (-1.35 pct)
4 1786.36 (0.00 pct) 1760.64 (-1.43 pct)
8 3304.47 (0.00 pct) 3105.19 (-6.03 pct)
16 5440.44 (0.00 pct) 5609.24 (3.10 pct)
32 10462.02 (0.00 pct) 10416.02 (-0.43 pct)
64 18995.99 (0.00 pct) 19317.34 (1.69 pct)
128 27896.44 (0.00 pct) 28459.38 (2.01 pct)
256 49742.89 (0.00 pct) 46371.44 (-6.77 pct) *
512 49583.01 (0.00 pct) 45717.22 (-7.79 pct) *
1024 48467.75 (0.00 pct) 43475.31 (-10.30 pct) *

o NPS2

Clients: tip eevdf
1 472.57 (0.00 pct) 475.35 (0.58 pct)
2 938.27 (0.00 pct) 942.19 (0.41 pct)
4 1764.34 (0.00 pct) 1783.50 (1.08 pct)
8 3043.57 (0.00 pct) 3205.85 (5.33 pct)
16 5103.53 (0.00 pct) 5154.94 (1.00 pct)
32 9767.22 (0.00 pct) 9793.81 (0.27 pct)
64 18712.65 (0.00 pct) 18601.10 (-0.59 pct)
128 27691.95 (0.00 pct) 27542.57 (-0.53 pct)
256 47939.24 (0.00 pct) 43401.62 (-9.46 pct) *
512 47843.70 (0.00 pct) 43971.16 (-8.09 pct) *
1024 48412.05 (0.00 pct) 42808.58 (-11.57 pct) *

o NPS4

Clients: tip eevdf
1 486.74 (0.00 pct) 484.88 (-0.38 pct)
2 950.50 (0.00 pct) 950.04 (-0.04 pct)
4 1778.58 (0.00 pct) 1796.03 (0.98 pct)
8 3106.36 (0.00 pct) 3180.09 (2.37 pct)
16 5139.81 (0.00 pct) 5139.50 (0.00 pct)
32 9911.04 (0.00 pct) 10086.37 (1.76 pct)
64 18201.46 (0.00 pct) 18289.40 (0.48 pct)
128 27284.67 (0.00 pct) 26947.19 (-1.23 pct)
256 46793.72 (0.00 pct) 43971.87 (-6.03 pct) *
512 48841.96 (0.00 pct) 44255.01 (-9.39 pct) *
1024 48811.99 (0.00 pct) 43118.99 (-11.66 pct) *

~~~~~~~~~~
~ stream ~
~~~~~~~~~~

o NPS1

- 10 Runs:

Test: tip eevdf
Copy: 321229.54 (0.00 pct) 332975.45 (3.65 pct)
Scale: 207471.32 (0.00 pct) 212534.83 (2.44 pct)
Add: 234962.15 (0.00 pct) 243011.39 (3.42 pct)
Triad: 246256.00 (0.00 pct) 256453.73 (4.14 pct)

- 100 Runs:

Test: tip eevdf
Copy: 332714.94 (0.00 pct) 333183.42 (0.14 pct)
Scale: 216140.84 (0.00 pct) 212160.53 (-1.84 pct)
Add: 239605.00 (0.00 pct) 233168.69 (-2.68 pct)
Triad: 258580.84 (0.00 pct) 256972.33 (-0.62 pct)

o NPS2

- 10 Runs:

Test: tip eevdf
Copy: 324423.92 (0.00 pct) 340685.20 (5.01 pct)
Scale: 215993.56 (0.00 pct) 217895.31 (0.88 pct)
Add: 250590.28 (0.00 pct) 257495.12 (2.75 pct)
Triad: 261284.44 (0.00 pct) 261373.49 (0.03 pct)

- 100 Runs:

Test: tip eevdf
Copy: 325993.72 (0.00 pct) 341244.18 (4.67 pct)
Scale: 227201.27 (0.00 pct) 227255.98 (0.02 pct)
Add: 256601.84 (0.00 pct) 258026.75 (0.55 pct)
Triad: 260222.19 (0.00 pct) 269878.75 (3.71 pct)

o NPS4

- 10 Runs:

Test: tip eevdf
Copy: 356850.80 (0.00 pct) 371230.27 (4.02 pct)
Scale: 247219.39 (0.00 pct) 237846.20 (-3.79 pct)
Add: 268588.78 (0.00 pct) 261088.54 (-2.79 pct)
Triad: 272932.59 (0.00 pct) 284068.07 (4.07 pct)

- 100 Runs:

Test: tip eevdf
Copy: 365965.18 (0.00 pct) 371186.97 (1.42 pct)
Scale: 246068.58 (0.00 pct) 245991.10 (-0.03 pct)
Add: 263677.73 (0.00 pct) 269021.14 (2.02 pct)
Triad: 273701.36 (0.00 pct) 280566.44 (2.50 pct)

~~~~~~~~~~~~~
~ Unixbench ~
~~~~~~~~~~~~~

o NPS1

Test Metric Parallelism tip eevdf
unixbench-dhry2reg Hmean unixbench-dhry2reg-1 49077561.21 ( 0.00%) 49144835.64 ( 0.14%)
unixbench-dhry2reg Hmean unixbench-dhry2reg-512 6285373890.61 ( 0.00%) 6270537933.92 ( -0.24%)
unixbench-syscall Amean unixbench-syscall-1 2664815.40 ( 0.00%) 2679289.17 * -0.54%*
unixbench-syscall Amean unixbench-syscall-512 7848462.70 ( 0.00%) 7456802.37 * 4.99%*
unixbench-pipe Hmean unixbench-pipe-1 2531131.89 ( 0.00%) 2475863.05 * -2.18%*
unixbench-pipe Hmean unixbench-pipe-512 305171024.40 ( 0.00%) 301182156.60 ( -1.31%)
unixbench-spawn Hmean unixbench-spawn-1 4058.05 ( 0.00%) 4284.38 * 5.58%*
unixbench-spawn Hmean unixbench-spawn-512 79893.24 ( 0.00%) 78234.45 * -2.08%*
unixbench-execl Hmean unixbench-execl-1 4148.64 ( 0.00%) 4086.73 * -1.49%*
unixbench-execl Hmean unixbench-execl-512 11077.20 ( 0.00%) 11137.79 ( 0.55%)

o NPS2

Test Metric Parallelism tip eevdf
unixbench-dhry2reg Hmean unixbench-dhry2reg-1 49394822.56 ( 0.00%) 49175574.26 ( -0.44%)
unixbench-dhry2reg Hmean unixbench-dhry2reg-512 6267817215.36 ( 0.00%) 6282838979.08 * 0.24%*
unixbench-syscall Amean unixbench-syscall-1 2663675.03 ( 0.00%) 2677018.53 * -0.50%*
unixbench-syscall Amean unixbench-syscall-512 7342392.90 ( 0.00%) 7443264.00 * -1.37%*
unixbench-pipe Hmean unixbench-pipe-1 2533194.04 ( 0.00%) 2475969.01 * -2.26%*
unixbench-pipe Hmean unixbench-pipe-512 303588239.03 ( 0.00%) 302217597.98 * -0.45%*
unixbench-spawn Hmean unixbench-spawn-1 5141.40 ( 0.00%) 4862.78 ( -5.42%) *
unixbench-spawn Hmean unixbench-spawn-512 82993.79 ( 0.00%) 79139.42 * -4.64%* *
unixbench-execl Hmean unixbench-execl-1 4140.15 ( 0.00%) 4084.20 * -1.35%*
unixbench-execl Hmean unixbench-execl-512 12229.25 ( 0.00%) 11445.22 ( -6.41%) *

o NPS4

Test Metric Parallelism tip eevdf
unixbench-dhry2reg Hmean unixbench-dhry2reg-1 48970677.27 ( 0.00%) 49070289.56 ( 0.20%)
unixbench-dhry2reg Hmean unixbench-dhry2reg-512 6297506696.81 ( 0.00%) 6311038905.07 ( 0.21%)
unixbench-syscall Amean unixbench-syscall-1 2664715.13 ( 0.00%) 2677752.20 * -0.49%*
unixbench-syscall Amean unixbench-syscall-512 7938670.70 ( 0.00%) 7972291.60 ( -0.42%)
unixbench-pipe Hmean unixbench-pipe-1 2527605.54 ( 0.00%) 2476140.77 * -2.04%*
unixbench-pipe Hmean unixbench-pipe-512 305068507.23 ( 0.00%) 304114548.50 ( -0.31%)
unixbench-spawn Hmean unixbench-spawn-1 5207.34 ( 0.00%) 4964.39 ( -4.67%) *
unixbench-spawn Hmean unixbench-spawn-512 81352.38 ( 0.00%) 74467.00 * -8.46%* *
unixbench-execl Hmean unixbench-execl-1 4131.37 ( 0.00%) 4044.09 * -2.11%*
unixbench-execl Hmean unixbench-execl-512 13025.56 ( 0.00%) 11124.77 * -14.59%* *

~~~~~~~~~~~
~ netperf ~
~~~~~~~~~~~

o NPS1

tip eevdf
1-clients: 107932.22 (0.00 pct) 106167.39 (-1.63 pct)
2-clients: 106887.99 (0.00 pct) 105304.25 (-1.48 pct)
4-clients: 106676.11 (0.00 pct) 104328.10 (-2.20 pct)
8-clients: 98645.45 (0.00 pct) 94076.26 (-4.63 pct)
16-clients: 88881.23 (0.00 pct) 86831.85 (-2.30 pct)
32-clients: 86654.28 (0.00 pct) 86313.80 (-0.39 pct)
64-clients: 81431.90 (0.00 pct) 74885.75 (-8.03 pct)
128-clients: 55993.77 (0.00 pct) 55378.10 (-1.09 pct)
256-clients: 43865.59 (0.00 pct) 44326.30 (1.05 pct)

o NPS2

tip eevdf
1-clients: 106711.81 (0.00 pct) 108576.27 (1.74 pct)
2-clients: 106987.79 (0.00 pct) 108348.24 (1.27 pct)
4-clients: 105275.37 (0.00 pct) 105702.12 (0.40 pct)
8-clients: 103028.31 (0.00 pct) 96250.20 (-6.57 pct)
16-clients: 87382.43 (0.00 pct) 87683.29 (0.34 pct)
32-clients: 86578.14 (0.00 pct) 86968.29 (0.45 pct)
64-clients: 81470.63 (0.00 pct) 75906.15 (-6.83 pct)
128-clients: 54803.35 (0.00 pct) 55051.90 (0.45 pct)
256-clients: 42910.29 (0.00 pct) 44062.33 (2.68 pct)

~~~~~~~~~~~
~ SpecJBB ~
~~~~~~~~~~~

o NPS1

tip eevdf
Max-jOPS 100% 115.71% (+15.71%) ^
Critical-jOPS 100% 93.59% (-6.41%) *

~~~~~~~~~~~~~~~~~~
~ DeathStarBench ~
~~~~~~~~~~~~~~~~~~

o NPS1

#CCX 1 CCX 2 CCX 3 CCX 4 CCX
o eevdf compared to tip -10.93 -14.35 -9.74 -6.07
o eevdf prev (this series as is)
compared to tip -1.99 -6.64 -4.99 -3.87

Note: #CCX is the number of LLCs the services are pinned to.
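
(A minimal Python sketch of restricting a running service to a single
LLC's CPUs is below; the CPU set in it, cores 0-7 plus their SMT
siblings 128-135, is only an illustrative guess at one CCX on this
machine, not necessarily the exact set or mechanism used in these runs.)

#!/usr/bin/env python3
# Minimal sketch: restrict an already-running service (by PID) to the
# CPUs of a single LLC. The CPU set below (cores 0-7 plus SMT siblings
# 128-135) is an illustrative guess at one CCX on this machine.
import os
import sys

CCX0_CPUS = set(range(0, 8)) | set(range(128, 136))

if __name__ == "__main__":
    pid = int(sys.argv[1])
    os.sched_setaffinity(pid, CCX0_CPUS)
    print(f"PID {pid} restricted to CPUs: {sorted(CCX0_CPUS)}")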

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ Some Preliminary Analysis ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

tl;dr

- There seems to be an increase in the number of involuntary context
  switches when the system is overloaded. This probably allows newly
  waking tasks to make progress sooner, benefiting latency-sensitive
  workloads like schbench in the overloaded scenario compared to tip,
  but hurts tbench performance.
  When the system is fully loaded, the larger average wait time seems
  to hurt schbench performance.
  More analysis is needed to get to the bottom of the problem.

- For the hackbench 2-groups scenario, the wait time seems to go up
  drastically.

Scheduler statistics of interest are listed in detail below.

Note: All metrics denoting time are in ms. They are processed from the
per-task schedstats in /proc/<pid>/sched.
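
Below is a minimal Python sketch of the kind of per-task aggregation
done. It is illustrative only, not the exact post-processing script
used; it assumes schedstats were enabled (sysctl
kernel.sched_schedstats=1) and matches field names loosely, since the
prefixes in the /proc/<pid>/sched output differ slightly across kernel
versions:

#!/usr/bin/env python3
# Minimal sketch (illustrative only): sum a few schedstat fields across
# all threads whose comm matches a filter, by parsing
# /proc/<pid>/task/<tid>/sched. Matches either the bare field name or a
# dotted suffix (e.g. "se.sum_exec_runtime").
import glob
import sys

FIELDS = ("nr_switches", "nr_voluntary_switches",
          "nr_involuntary_switches", "sum_exec_runtime", "wait_sum")

def aggregate(comm_filter):
    totals = dict.fromkeys(FIELDS, 0.0)
    for path in glob.glob("/proc/[0-9]*/task/[0-9]*/sched"):
        try:
            lines = open(path).read().splitlines()
        except OSError:                       # task exited while scanning
            continue
        # First line looks like: "schbench (1234, #threads: 128)"
        if not lines or comm_filter not in lines[0]:
            continue
        for line in lines:
            name, sep, value = (tok.strip() for tok in line.partition(":"))
            if not sep:
                continue
            for field in FIELDS:
                if name == field or name.endswith("." + field):
                    try:
                        totals[field] += float(value)
                    except ValueError:
                        pass
                    break
    return totals

if __name__ == "__main__":
    comm = sys.argv[1] if len(sys.argv) > 1 else "schbench"
    for field, total in aggregate(comm).items():
        print(f"Sum of {field:<24}: {total}")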

o Hackbench (2 Groups) (NPS1)

tip eevdf %diff
Comm sched-messaging sched-messaging N/A
Sum of avg_atom 282.0024818 19.04355233 -93.24702669
Average of avg_atom 3.481512121 0.235105584 -93.24702669
Sum of avg_per_cpu 1761.949461 61.52537145 -96.50810805
Average of avg_per_cpu 21.75246248 0.759572487 -96.50810805
Average of avg_wait_time 0.007239228 0.012899105 78.18343632
Sum of nr_switches 4897740 4728784 -3.449672706
Sum of nr_voluntary_switches 4742512 4621606 -2.549408415
Sum of nr_involuntary_switches 155228 107178 -30.95446698
Sum of nr_wakeups 4742648 4623175 -2.51912012
Sum of nr_migrations 1263925 930600 -26.37221354
Sum of sum_exec_runtime 288481.15 262255.2574 -9.091024712
Sum of sum_idle_runtime 2576164.568 2851759.68 10.69788457
Sum of sum_sleep_runtime 76890.14753 78632.31679 2.265789982
Sum of wait_count 4897894 4728939 -3.449543824
Sum of wait_sum 3041.78227 24167.4694 694.5167422

o schbench (2 messengers, 128 workers - fully loaded) (NPS1)

tip eevdf %diff
Comm schbench schbench N/A
Sum of avg_atom 7538.162897 7289.565705 -3.297848503
Average of avg_atom 29.10487605 28.14504133 -3.297848503
Sum of avg_per_cpu 630248.6079 471215.3671 -25.23341406
Average of avg_per_cpu 2433.392309 1819.364352 -25.23341406
Average of avg_wait_time 0.054147456 25.34304285 46703.75524
Sum of nr_switches 85210 88176 3.480812111
Sum of nr_voluntary_switches 83165 83457 0.351109241
Sum of nr_involuntary_switches 2045 4719 130.7579462
Sum of nr_wakeups 83168 83459 0.34989419
Sum of nr_migrations 3265 3025 -7.350689127
Sum of sum_exec_runtime 2476504.52 2469058.164 -0.300680129
Sum of sum_idle_runtime 110294825.8 132520924.2 20.15153321
Sum of sum_sleep_runtime 5293337.741 5297778.714 0.083897408
Sum of sum_block_runtime 56.043253 15.12936 -73.00413664
Sum of wait_count 85615 88606 3.493546692
Sum of wait_sum 4653.340163 9605.221964 106.4156418

o schbench (2 messengers, 256 workers - overloaded) (NPS1)

tip eevdf %diff
Comm schbench schbench N/A
Sum of avg_atom 11676.77306 4803.485728 -58.8629007
Average of avg_atom 22.67334574 9.327156753 -58.8629007
Sum of avg_per_cpu 55235.68013 38286.47722 -30.68524343
Average of avg_per_cpu 107.2537478 74.34267421 -30.68524343
Average of avg_wait_time 2.23189096 2.58191945 15.68304621
Sum of nr_switches 202862 425258 109.6292061
Sum of nr_voluntary_switches 163079 165058 1.213522281
Sum of nr_involuntary_switches 39783 260200 554.0482115
Sum of nr_wakeups 163082 165058 1.211660392
Sum of nr_migrations 44199 54894 24.19738003
Sum of sum_exec_runtime 4586675.667 3963846.024 -13.57910801
Sum of sum_idle_runtime 201050644.2 195126863.7 -2.946412087
Sum of sum_sleep_runtime 10418117.66 10402686.4 -0.148119407
Sum of sum_block_runtime 1548.979156 516.115078 -66.68030838
Sum of wait_count 203377 425792 109.3609405
Sum of wait_sum 455609.3122 1100885.201 141.6292142

o tbench (256 clients - overloaded) (NPS1)

- tbench client
tip eevdf % diff
comm tbench tbench N/A
Sum of avg_atom 3.594587941 5.112101854 42.21663064
Average of avg_atom 0.013986724 0.019891447 42.21663064
Sum of avg_per_cpu 392838.0975 142065.4206 -63.83613975
Average of avg_per_cpu 1528.552909 552.7837377 -63.83613975
Average of avg_wait_time 0.010512441 0.006861579 -34.72895916
Sum of nr_switches 692845080 511780111 -26.1335433
Sum of nr_voluntary_switches 178151085 371234907 108.3820635
Sum of nr_involuntary_switches 514693995 140545204 -72.69344399
Sum of nr_wakeups 178151085 371234909 108.3820646
Sum of nr_migrations 45279 71177 57.19649286
Sum of sum_exec_runtime 9192343.465 9624025.792 4.69610746
Sum of sum_idle_runtime 7125370.721 16145736.39 126.5950365
Sum of sum_sleep_runtime 2222469.726 5792868.629 160.650058
Sum of sum_block_runtime 68.60879 446.080476 550.1797743
Sum of wait_count 692845479 511780543 -26.13352349
Sum of wait_sum 7287852.246 3297894.139 -54.7480653

- tbench server

tip eevdf % diff
Comm tbench_srv tbench_srv N/A
Sum of avg_atom 5.077837807 5.447267364 7.275331971
Average of avg_atom 0.019758124 0.021195593 7.275331971
Sum of avg_per_cpu 538586.1634 87925.51225 -83.67475471
Average of avg_per_cpu 2095.666006 342.1226158 -83.67475471
Average of avg_wait_time 0.000827346 0.006505748 686.3392261
Sum of nr_switches 692980666 511838912 -26.13951051
Sum of nr_voluntary_switches 690367607 390304935 -43.46418762
Sum of nr_involuntary_switches 2613059 121533977 4551.023073
Sum of nr_wakeups 690367607 390304935 -43.46418762
Sum of nr_migrations 39486 84474 113.9340526
Sum of sum_exec_runtime 9176708.278 8734423.401 -4.819646259
Sum of sum_idle_runtime 413900.3645 447180.3879 8.040588086
Sum of sum_sleep_runtime 8966201.976 6690818.107 -25.37734345
Sum of sum_block_runtime 1.776413 1.617435 -8.949382829
Sum of wait_count 692980942 511839229 -26.13949418
Sum of wait_sum 565739.6984 3295519.077 482.5150836

>
> The nice -19 numbers aren't as pretty as Vincent's, but at the end I was going
> cross-eyed from staring at tree prints and I just couldn't figure out where it
> was going side-ways.
>
> There's definitely more benchmarking/tweaking to be done (0-day already
> reported a stress-ng loss), but if we can pull this off we can delete a whole
> bunch of icky heuristics code. EEVDF is a much better defined policy than what
> we currently have.
>

DeathStarBench and SpecJBB are slightly more complex to analyze. I'll
get the schedstat data for both soon. I'll also rerun some of the above
workloads with NO_PRESERVE_LAG to see if that makes any difference
(see the sketch below).
In the meantime, if you need more data from the test system for any
particular workload, please do let me know. I will collect the per-task
and system-wide schedstat data for the workload, as it is rather
inexpensive to collect and gives good insights, but if you need any
other data, I'll be more than happy to gather that too.
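
For the NO_PRESERVE_LAG runs, the plan is to flip the feature through
debugfs before each run; a minimal Python sketch, assuming the usual
/sys/kernel/debug/sched/features interface and the standard sched_feat
naming, is:

#!/usr/bin/env python3
# Minimal sketch: flip a scheduler feature bit via debugfs before a run.
# Assumes debugfs is mounted at /sys/kernel/debug and the script runs as
# root; writing "NO_<FEAT>" clears the corresponding feature bit.
FEATURES = "/sys/kernel/debug/sched/features"

def set_sched_feat(name: str) -> None:
    with open(FEATURES, "w") as f:
        f.write(name)

if __name__ == "__main__":
    set_sched_feat("NO_PRESERVE_LAG")
    print(open(FEATURES).read())          # verify the current feature mask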

--
Thanks and Regards,
Prateek