Re: [PATCH v5 2/5] sched/fair: Limited scan for idle cores when overloaded

From: Chen Yu
Date: Fri Sep 09 2022 - 05:33:24 EST


On 2022-09-09 at 13:53:01 +0800, Abel Wu wrote:
> The has_idle_cores hint could be misleading due to some kind of
> rapid idling workloads, especially when LLC is overloaded. If this
> is the case, then there will be some full scan cost incurred that
> often fails to find a core.
>
> So limit the scan depth for idle cores in such cases to make a
> speculative inspection at a reasonable cost.
>
> Benchmark
> =========
>
> Tests are done on a dual-socket (2 x 24C/48T) Intel Xeon(R) Platinum
> 8260 machine, with two SNC configurations:
>
> SNC on: 4 NUMA nodes each of which has 12C/24T
> SNC off: 2 NUMA nodes each of which has 24C/48T
>
> All of the benchmarks are done inside a normal cpu cgroup in a clean
> environment with cpu turbo disabled.
>
> Based on tip sched/core 0fba527e959d (v5.19.0) plus previous patches
> of this series.
>
> Results
> =======
>
> hackbench-process-pipes
> unpatched patched
> (SNC on)
> Amean 1 0.4470 ( 0.00%) 0.4557 ( -1.94%)
> Amean 4 0.5947 ( 0.00%) 0.6033 ( -1.46%)
> Amean 7 0.7450 ( 0.00%) 0.7627 ( -2.37%)
> Amean 12 1.1053 ( 0.00%) 1.0653 ( 3.62%)
> Amean 21 1.9420 ( 0.00%) 2.0283 * -4.45%*
> Amean 30 2.9267 ( 0.00%) 2.9670 ( -1.38%)
> Amean 48 4.7027 ( 0.00%) 4.6863 ( 0.35%)
> Amean 79 7.7097 ( 0.00%) 7.9443 * -3.04%*
> Amean 110 10.0680 ( 0.00%) 10.2393 ( -1.70%)
> Amean 141 12.5450 ( 0.00%) 12.6343 ( -0.71%)
> Amean 172 15.0297 ( 0.00%) 14.9957 ( 0.23%)
> Amean 203 16.8827 ( 0.00%) 16.9133 ( -0.18%)
> Amean 234 19.1183 ( 0.00%) 19.2433 ( -0.65%)
> Amean 265 20.9893 ( 0.00%) 21.6917 ( -3.35%)
> Amean 296 23.3920 ( 0.00%) 23.8743 ( -2.06%)
> (SNC off)
> Amean 1 0.2717 ( 0.00%) 0.3143 ( -15.71%)
> Amean 4 0.6257 ( 0.00%) 0.6070 ( 2.98%)
> Amean 7 0.7740 ( 0.00%) 0.7960 ( -2.84%)
> Amean 12 1.2410 ( 0.00%) 1.1947 ( 3.73%)
> Amean 21 2.6410 ( 0.00%) 2.4837 ( 5.96%)
> Amean 30 3.7620 ( 0.00%) 3.4577 ( 8.09%)
> Amean 48 6.7757 ( 0.00%) 5.5227 * 18.49%*
> Amean 79 8.8827 ( 0.00%) 9.2933 ( -4.62%)
> Amean 110 11.0583 ( 0.00%) 11.0443 ( 0.13%)
> Amean 141 13.3387 ( 0.00%) 13.1360 ( 1.52%)
> Amean 172 15.9583 ( 0.00%) 15.7770 ( 1.14%)
> Amean 203 17.8757 ( 0.00%) 17.9557 ( -0.45%)
> Amean 234 20.0543 ( 0.00%) 20.4373 * -1.91%*
> Amean 265 22.6643 ( 0.00%) 23.6053 * -4.15%*
> Amean 296 25.6677 ( 0.00%) 25.6803 ( -0.05%)
>
> Run-to-run variation is large in the 1-group test, so it can be safely
> ignored.
>
> With limited scan for idle cores when the LLC is overloaded, a slight
> regression can be seen with the smaller LLCs. This is because the cost
> of a full scan on these LLCs is much smaller than on machines with
> bigger LLCs. So when it comes to the SNC off case, the limited scan
> provides an obvious benefit, especially when the frequency of such
> scans is relatively high, e.g. <48 groups.
>
> It's not a universal win, but considering that LLCs are getting bigger
> nowadays, we should be careful about the scan depth, and a limited scan
> in certain circumstances is indeed necessary.
>
> tbench4 Throughput
> unpatched patched
> (SNC on)
> Hmean 1 309.43 ( 0.00%) 301.54 * -2.55%*
> Hmean 2 613.92 ( 0.00%) 607.81 * -0.99%*
> Hmean 4 1227.84 ( 0.00%) 1210.64 * -1.40%*
> Hmean 8 2379.04 ( 0.00%) 2381.73 * 0.11%*
> Hmean 16 4634.66 ( 0.00%) 4601.21 * -0.72%*
> Hmean 32 7592.12 ( 0.00%) 7626.84 * 0.46%*
> Hmean 64 9241.11 ( 0.00%) 9251.51 * 0.11%*
> Hmean 128 17870.37 ( 0.00%) 20620.98 * 15.39%*
> Hmean 256 19370.92 ( 0.00%) 20406.51 * 5.35%*
> Hmean 384 19413.92 ( 0.00%) 20312.97 * 4.63%*
> (SNC off)
> Hmean 1 287.90 ( 0.00%) 292.37 * 1.55%*
> Hmean 2 575.52 ( 0.00%) 583.29 * 1.35%*
> Hmean 4 1137.94 ( 0.00%) 1155.83 * 1.57%*
> Hmean 8 2250.42 ( 0.00%) 2297.63 * 2.10%*
> Hmean 16 4363.41 ( 0.00%) 4562.44 * 4.56%*
> Hmean 32 7338.06 ( 0.00%) 7425.69 * 1.19%*
> Hmean 64 8914.66 ( 0.00%) 9021.77 * 1.20%*
> Hmean 128 19978.59 ( 0.00%) 20257.76 * 1.40%*
> Hmean 256 20057.49 ( 0.00%) 20043.54 * -0.07%*
> Hmean 384 19846.74 ( 0.00%) 19528.03 * -1.61%*
>
> Conclusion
> ==========
>
> Limited scan for idle cores when the LLC is overloaded is almost
> neutral compared to a full scan on smaller LLCs, but is an obvious win
> on the bigger ones, which are future-oriented.
>
> Suggested-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Abel Wu <wuyun.abel@xxxxxxxxxxxxx>
> ---
> kernel/sched/fair.c | 26 +++++++++++++++++++++-----
> 1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5af9bf246274..7abe188a1533 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6437,26 +6437,42 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> time = cpu_clock(this);
> }
>
> - if (sched_feat(SIS_UTIL) && !has_idle_core) {
> + if (sched_feat(SIS_UTIL)) {
Patch [1/5] added the !has_idle_core check, but this patch removes it.
I'm trying to figure out the reason. Is it to better illustrate the
benchmark difference?
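
To make the question concrete, here is a minimal standalone sketch of
just the condition this hunk changes. The helper names and the boolean
parameters are hypothetical stand-ins for sched_feat(SIS_UTIL) and
has_idle_core, not the kernel code, and whatever happens inside the
branch is outside the quoted context, so it is not modeled:

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for illustration only. They model nothing but
 * the "if" condition on the removed and added lines of the hunk.
 */

/* Removed line: the branch is taken only when no idle core is hinted. */
static bool sis_util_branch_before(bool sis_util, bool has_idle_core)
{
	return sis_util && !has_idle_core;
}

/* Added line: the branch is taken whenever the feature is enabled. */
static bool sis_util_branch_after(bool sis_util, bool has_idle_core)
{
	(void)has_idle_core;	/* no longer part of the condition */
	return sis_util;
}
```

The only input where the two differ is SIS_UTIL enabled with
has_idle_core set: the old condition skips the branch and the new one
takes it.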

thanks,
Chenyu