Re: [PATCH v1] sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal

From: Mel Gorman
Date: Fri Nov 06 2020 - 07:03:25 EST


On Wed, Nov 04, 2020 at 09:42:05AM +0000, Mel Gorman wrote:
> While it's possible that some other factor masked the impact of the patch,
> the fact it's neutral for two workloads in 5.10-rc2 is suspicious as it
> indicates that if the patch was implemented against 5.10-rc2, it would
> likely not have been merged. I've queued the tests on the remaining
> machines to see if something more conclusive falls out.
>

It's not as conclusive as I would like. fork_test generally benefits
across the board, but I do not put much weight on that result.

Otherwise, it's workload and machine-specific.
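
For anyone reading without the original patch to hand, the change under
discussion is the group_has_spare tie-break in update_pick_idlest() in
kernel/sched/fair.c. A simplified sketch of the logic the revert would
remove, paraphrased rather than quoted from the tree:

	case group_has_spare:
		/* Prefer the group with the most idle CPUs. */
		if (idlest_sgs->idle_cpus > sgs->idle_cpus)
			return false;

		/*
		 * The tie-break the patch added: when idle_cpus are
		 * equal, prefer the group with the lower group_util.
		 * The revert drops this second comparison.
		 */
		if (idlest_sgs->idle_cpus == sgs->idle_cpus &&
		    idlest_sgs->group_util <= sgs->group_util)
			return false;
		break;

The per-workload results below compare 5.10-rc2 with and without that
check.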

schbench: (wakeup latency sensitive), all machines benefitted from the
revert at low utilisation except one 2-socket Haswell machine, which
showed higher variability when the machine was fully utilised.

hackbench: Neutral except for the same 2-socket Haswell machine, which
took an 8% performance penalty for smaller numbers of groups and a 4%
penalty for higher numbers of groups.

pipetest: Mostly neutral except for the *same* machine showing an 18%
performance gain by reverting.

kernbench: Shows small gains at low job counts across the board, ranging
from 0.84% at the lowest up to 5.93% depending on the machine.

gitsource: low utilisation execution of the git test suite. This was
mostly a win for the revert. For the machines tested the results were:

14.48% gain (2 socket with SNC enabled, giving 4 NUMA nodes)
neutral (2 socket Broadwell)
36.37% gain (1 socket Skylake)
3.18% gain (2 socket Broadwell)
4.4% gain (2 socket EPYC 2)
1.85% gain (2 socket EPYC 1)

While it was clear-cut for 5.9, it's less clear-cut for 5.10-rc2, although
gitsource shows some severe machine-dependent differences that are worth
being extremely cautious about. I would still prefer a revert
but I'm also extremely biased and I know there are other patches in the
pipeline that may change the picture. A wider battery of tests might
paint a clearer picture but may not be worth the time investment.

So maybe let's just keep an eye on this one. When the scheduler pipeline
dies down a bit (does that happen?), we should at least revisit it.

--
Mel Gorman
SUSE Labs