[PATCH] sched/fair: Add SMT4 group_smt_balance handling

From: Tim Chen
Date: Fri Jul 14 2023 - 19:09:30 EST


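For SMT4, any group with more than 2 tasks will be marked as
group_smt_balance. Retain the behaviour of group_has_spare by picking
as busiest the group with the fewest idle_cpus. If idle_cpus are equal,
prefer the group with more running tasks; if neither group has an idle
CPU, fall through to the group_fully_busy comparison.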
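
The selection order described above can be sketched in user space as
follows. This is an illustrative model only: struct sg_stats_model, enum
smt_pick and smt_balance_pick() are hypothetical names, not kernel
symbols, and the struct keeps just the two fields that the new
group_smt_balance case reads.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the kernel's struct sg_lb_stats. */
struct sg_stats_model {
	unsigned int idle_cpus;
	unsigned int sum_nr_running;
};

/* Hypothetical result type; the kernel code returns bool or jumps. */
enum smt_pick {
	SMT_KEEP_BUSIEST,	/* the patch returns false */
	SMT_PICK_CANDIDATE,	/* the patch returns true */
	SMT_DEFER_FULLY_BUSY,	/* the patch does "goto fully_busy" */
};

static enum smt_pick smt_balance_pick(const struct sg_stats_model *sgs,
				      const struct sg_stats_model *busiest)
{
	/* No idle CPU in either group: leave it to the avg_load check. */
	if (sgs->idle_cpus == 0 && busiest->idle_cpus == 0)
		return SMT_DEFER_FULLY_BUSY;

	/* Fewer idle CPUs means busier. */
	if (sgs->idle_cpus != busiest->idle_cpus)
		return sgs->idle_cpus < busiest->idle_cpus ?
			SMT_PICK_CANDIDATE : SMT_KEEP_BUSIEST;

	/* Same idle count: only a strictly larger task count wins. */
	return sgs->sum_nr_running > busiest->sum_nr_running ?
		SMT_PICK_CANDIDATE : SMT_KEEP_BUSIEST;
}
```

Under this model a candidate with 1 idle CPU replaces a current busiest
group with 2 idle CPUs, while a full tie on both fields keeps the
current busiest group.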

Also, handle the rounding effect of adding (ncores_local + ncores_busiest)
when the local group is fully idle and the busiest group has more than 2
tasks. The local group should try to pull at least 1 task in this case.
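
A worked example may make the rounding effect easier to see. The sketch
below is a user-space model, not the kernel function:
model_sibling_imbalance() and its "patched" flag are hypothetical, and
the two statements before the division are an assumption about the
surrounding code, which this patch does not show. It covers only the
case where the two groups have different core counts.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the normalization step.
 * Assumption (not visible in the diff): before the division the value is
 *   2 * raw + ncores_local + ncores_busiest
 * with raw = ncores_local * busiest_nr - ncores_busiest * local_nr,
 * clamped at zero.
 */
static long model_sibling_imbalance(long ncores_local, long ncores_busiest,
				    long local_nr, long busiest_nr,
				    bool patched)
{
	long imbalance = ncores_local * busiest_nr -
			 ncores_busiest * local_nr;

	if (imbalance < 0)
		imbalance = 0;

	/* Adding the divisor first means the quotient is always >= 1. */
	imbalance = 2 * imbalance + ncores_local + ncores_busiest;
	imbalance /= ncores_local + ncores_busiest;

	/* Old check: imbalance == 0. New check: imbalance <= 1. */
	if ((patched ? imbalance <= 1 : imbalance == 0) &&
	    local_nr == 0 && busiest_nr > 1)
		imbalance = 2;

	return imbalance;
}
```

With an idle local group of 1 core and a busiest group of 4 cores
running 2 tasks, this model yields (2 * 2 + 5) / 5 = 1. The old
"== 0" test does not match that value, so the result stays at 1; the
new "<= 1" test raises it to 2.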

Fixes: fee1759e4f04 ("sched/fair: Determine active load balance for SMT sched groups")
Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
---
kernel/sched/fair.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0b7445cd5af9..6e7ee2efc1ba 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9575,7 +9575,7 @@ static inline long sibling_imbalance(struct lb_env *env,
 	imbalance /= ncores_local + ncores_busiest;
 
 	/* Take advantage of resource in an empty sched group */
-	if (imbalance == 0 && local->sum_nr_running == 0 &&
+	if (imbalance <= 1 && local->sum_nr_running == 0 &&
 	    busiest->sum_nr_running > 1)
 		imbalance = 2;
 
@@ -9763,6 +9763,19 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 		break;
 
 	case group_smt_balance:
+		/* no idle cpus on both groups handled by group_fully_busy below */
+		if (sgs->idle_cpus != 0 || busiest->idle_cpus != 0) {
+			if (sgs->idle_cpus > busiest->idle_cpus)
+				return false;
+			if (sgs->idle_cpus < busiest->idle_cpus)
+				return true;
+			if (sgs->sum_nr_running <= busiest->sum_nr_running)
+				return false;
+			else
+				return true;
+		}
+		goto fully_busy;
+
 	case group_fully_busy:
 		/*
 		 * Select the fully busy group with highest avg_load. In
@@ -9775,7 +9788,7 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 		 * select the 1st one, except if @sg is composed of SMT
 		 * siblings.
 		 */
-
+fully_busy:
 		if (sgs->avg_load < busiest->avg_load)
 			return false;
 
-- 
2.32.0