Re: [PATCH] sched: Take thermal pressure into account when determine rt fits capacity

From: Qais Yousef
Date: Tue May 03 2022 - 10:44:00 EST


Hi Xuewen

On 05/01/22 11:20, Xuewen Yan wrote:
> Hi Qais
> Thanks for the patient explanation.:)
> And I have some other concerns.
>
> On Wed, Apr 27, 2022 at 6:58 PM Qais Yousef <qais.yousef@xxxxxxx> wrote:
> >
> > On 04/27/22 09:38, Xuewen Yan wrote:
> > > > > > The best (simplest) way forward IMHO is to introduce a new function
> > > > > >
> > > > > > bool cpu_in_capacity_inversion(int cpu);
>
> I have not yet thought of a good way to implement this function.
> (1) How do we define the inversion? If the system has two clusters
> (little/big) it is easy, but we still need to mark which is the big
> cpu...

I'd define it as:

capacity_orig_of(cpu) - thermal_pressure(cpu) < capacity_orig_of(next_level_cpu)
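
To make the definition concrete, here is a minimal userspace sketch of that
predicate. It is not kernel code: in_capacity_inversion() and its plain integer
arguments are hypothetical stand-ins for capacity_orig_of() and the thermal
pressure signal, and it assumes capacities on the usual 0..1024 scale.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model, not kernel code.
 * cap_orig:      original capacity of the CPU being checked
 * thermal:       capacity currently lost to thermal pressure
 * next_cap_orig: original capacity of a CPU in the next (lower) level
 */
static bool in_capacity_inversion(unsigned long cap_orig,
				  unsigned long thermal,
				  unsigned long next_cap_orig)
{
	/* Clamp so the unsigned subtraction cannot wrap around. */
	unsigned long eff = thermal < cap_orig ? cap_orig - thermal : 0;

	return eff < next_cap_orig;
}
```

For example, a big CPU at 1024 that loses 600 to thermal pressure is left with
424, which is below a little CPU at 512, so it counts as inverted. Losing only
100 leaves 924 and no inversion.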

> (2) Because the mainline kernel should be generic, if the system has three
> or more clusters, the mid cpus could also be in inversion;

Yes. I pray this is highly unlikely though! We should cater for it still.

> (3) When do we update the cpu inversion state? If we re-evaluate the
> inversion whenever the thermal pressure changes, can we accept the
> complexity? We may have to traverse all possible cpus.

In my head, it would make sense to detect the inversion in
update_cpu_capacity() OR in topology_update_thermal_pressure(), so the check
runs at whatever rate those updates already happen.

Does this answer your question?

Basically I believe something like this should be enough (completely untested)

--->8---


diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a68482d66535..44c7c2598d87 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8399,16 +8399,37 @@ static unsigned long scale_rt_capacity(int cpu)

 static void update_cpu_capacity(struct sched_domain *sd, int cpu)
 {
+	unsigned long capacity_orig = arch_scale_cpu_capacity(cpu);
 	unsigned long capacity = scale_rt_capacity(cpu);
 	struct sched_group *sdg = sd->groups;
+	struct rq *rq = cpu_rq(cpu);

-	cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(cpu);
+	rq->cpu_capacity_orig = capacity_orig;

 	if (!capacity)
 		capacity = 1;

-	cpu_rq(cpu)->cpu_capacity = capacity;
-	trace_sched_cpu_capacity_tp(cpu_rq(cpu));
+	rq->cpu_capacity = capacity;
+	trace_sched_cpu_capacity_tp(rq);
+
+	if (static_branch_unlikely(&sched_asym_cpucapacity)) {
+		unsigned long inv_cap = capacity_orig - thermal_load_avg(rq);
+
+		rq->cpu_capacity_inverted = 0;
+
+		for_each_possible_cpu(cpu) {
+			unsigned long cap = arch_scale_cpu_capacity(cpu);
+
+			if (capacity_orig <= cap)
+				continue;
+
+			if (cap > inv_cap) {
+				rq->cpu_capacity_inverted = inv_cap;
+				break;
+			}
+		}
+
+	}

 	sdg->sgc->capacity = capacity;
 	sdg->sgc->min_capacity = capacity;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 8dccb34eb190..bfe84c870bf9 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -992,6 +992,7 @@ struct rq {

 	unsigned long cpu_capacity;
 	unsigned long cpu_capacity_orig;
+	unsigned long cpu_capacity_inverted;

 	struct callback_head *balance_callback;

@@ -2807,6 +2808,11 @@ static inline unsigned long capacity_orig_of(int cpu)
 	return cpu_rq(cpu)->cpu_capacity_orig;
 }

+static inline unsigned long cpu_in_capacity_inversion(int cpu)
+{
+	return cpu_rq(cpu)->cpu_capacity_inverted;
+}
+
 /**
  * enum cpu_util_type - CPU utilization type
  * @FREQUENCY_UTIL: Utilization used to select frequency


--->8---
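
For anyone who wants to play with the idea outside the kernel, the detection
loop can be modelled in plain userspace C roughly as below. This is only a
sketch: model_capacity_inverted() is a hypothetical name, cap_orig[] stands in
for arch_scale_cpu_capacity() and thermal[] for thermal_load_avg(), and the
clamp on the subtraction is an addition that the patch itself does not have.

```c
#include <stddef.h>

/*
 * Hypothetical userspace model of the detection loop, not kernel code.
 * cap_orig[] and thermal[] are illustrative per-CPU arrays.
 *
 * Returns the pressure-reduced capacity of @cpu if some CPU with a smaller
 * original capacity now exceeds it, otherwise 0. This mirrors the convention
 * used for rq->cpu_capacity_inverted.
 */
static unsigned long model_capacity_inverted(const unsigned long *cap_orig,
					     const unsigned long *thermal,
					     size_t nr_cpus, size_t cpu)
{
	unsigned long capacity_orig = cap_orig[cpu];
	/* Clamp so the unsigned subtraction cannot wrap (not in the patch). */
	unsigned long inv_cap = thermal[cpu] < capacity_orig ?
				capacity_orig - thermal[cpu] : 0;
	size_t i;

	for (i = 0; i < nr_cpus; i++) {
		unsigned long cap = cap_orig[i];

		/* Only CPUs from a lower capacity level can invert us. */
		if (capacity_orig <= cap)
			continue;

		if (cap > inv_cap)
			return inv_cap;
	}

	return 0;
}
```

With a three-cluster layout of {256, 256, 768, 768, 1024} and 600 of thermal
pressure on the first mid CPU, that CPU is left with 168 and is reported as
inverted against the littles, which is the case raised in (2). A big CPU
losing only 100 stays at 924 and is not inverted.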

Thanks

--
Qais Yousef