Re: [PATCH] sched: Take thermal pressure into account when determine rt fits capacity

From: Qais Yousef
Date: Wed Apr 27 2022 - 07:04:04 EST


On 04/27/22 09:38, Xuewen Yan wrote:
> > > > The best (simplest) way forward IMHO is to introduce a new function
> > > >
> > > > bool cpu_in_capacity_inversion(int cpu);
> > > >
> > > > (feel free to pick another name) which will detect the scenario you're in. You
> > > > can use this function then in rt_task_fits_capacity()
> > > >
> > > > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > > > index a32c46889af8..d48811a7e956 100644
> > > > --- a/kernel/sched/rt.c
> > > > +++ b/kernel/sched/rt.c
> > > > @@ -462,6 +462,9 @@ static inline bool rt_task_fits_capacity(struct task_struct *p, int cpu)
> > > > if (!static_branch_unlikely(&sched_asym_cpucapacity))
> > > > return true;
> > > >
> > > > + if (cpu_in_capacity_inversion(cpu))
> > > > + return false;
> > > > +
> > > > min_cap = uclamp_eff_value(p, UCLAMP_MIN);
> > > > max_cap = uclamp_eff_value(p, UCLAMP_MAX);
> > > >
> > > > You'll probably need to do something similar in dl_task_fits_capacity().
> > > >
> > > > This might be a bit aggressive though as we'll steer away all RT tasks from
> > > > this CPU (as long as there's another CPU that can fit it). I need to think more
> > > > about it. But we could do something like this too
> > > >
> > > > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > > > index a32c46889af8..f2a34946a7ab 100644
> > > > --- a/kernel/sched/rt.c
> > > > +++ b/kernel/sched/rt.c
> > > > @@ -462,11 +462,14 @@ static inline bool rt_task_fits_capacity(struct task_struct *p, int cpu)
> > > > if (!static_branch_unlikely(&sched_asym_cpucapacity))
> > > > return true;
> > > >
> > > > + cpu_cap = capacity_orig_of(cpu);
> > > > +
> > > > + if (cpu_in_capacity_inversion(cpu))
> > >
> > > It's a good idea, but as you said, in mainline, the
> > > sysctl_sched_uclamp_util_min_rt_default is always 1024,
> > > Maybe it's better to add it to the judgment?
> >
> > I don't think so. If we want to handle finding the next best thing, we need to
> > make the search more complex than that. This is no worse than having 2 RT tasks
> > waking up at the same time while there's only a single big CPU. One of them
> > will end up on a medium or a little and we don't provide better guarantees
> > here.
>
> I may have misunderstood your patch before, do you mean this:
> 1. the cpu has to be inversion, if not, the cpu's capacity is still
> the biggest, although the sysctl_sched_uclamp_util_min_rt_default
> =1024, it still can put on the cpu.
> 2. If the cpu is inversion, the thermal pressure should be considered,
> at this time, if the sysctl_sched_uclamp_util_min_rt_default is not
> 1024, make the rt still have chance to select the cpu.
> If the sysctl_sched_uclamp_util_min_rt_default is 1024, all of the
> cpu actually can not fit the rt, at this time, select cpu without
> considering the cap_orig_of(cpu). The worst thing may be that rt
> would put on the small core.
>
> I understand right? If so, Perhaps this approach has the least impact
> on the current code complexity.

I believe you understood correctly. Tasks that need to run at 1024 when the
biggest cpu is in capacity inversion will get screwed - the system can't
satisfy their requirements. If they're happy to run on a medium (the next best
thing), then their uclamp_min should change to reflect that. If they are not
happy to run at the medium, then I'm not sure if it'll make much of
a difference if they end up on little. Their deadline will be missed anyway..

Again this is no worse than having two RT tasks with uclamp_min = 1024 waking
up at the same time on a system with 1 big cpu. Only one of them will be able
to run there.

I think tasks wanting 1024 is rare and no one seemed to bother with doing
better here so far. But we can certainly do better if need to :-)


Thanks

--
Qais Yousef