Re: [PATCH v4] sched/fair: unlink misfit task from cpu overutilized

From: Dietmar Eggemann
Date: Tue Jan 31 2023 - 11:36:24 EST


On 27/01/2023 17:20, Vincent Guittot wrote:
> On Thu, 26 Jan 2023 at 12:42, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>
>> On 19/01/2023 17:42, Vincent Guittot wrote:
>>> By taking into account uclamp_min, the 1:1 relation between task misfit
>>> and cpu overutilized no longer holds, as a task with a small util_avg may
>>> not fit a high capacity cpu because of the uclamp_min constraint.
>>>
>>> Add a new state in util_fits_cpu() to reflect the case where a task would
>>> fit a CPU except for the uclamp_min hint, which is a performance requirement.
>>>
>>> Use -1 to reflect that a CPU doesn't fit only because of uclamp_min so we
>>> can use this new value to take additional action to select the best CPU
>>> that doesn't match the uclamp_min hint.
>>>
>>> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
>>> ---
>>>
>>> Change since v3:
>>> - Keep current condition for uclamp_max_fits in util_fits_cpu()
>>> - Update some comments
>>
>> We already had this discussion about whether this patch can also remove
>> Capacity Inversion (CapInv).
>>
>> After studying the code again, I'm not so sure anymore.
>>
>> This patch:
>>
>> (1) adds a dedicated return value (-1) to util_fits_cpu() when:
>>
>> `util fits 80% capacity_of() && util < uclamp_min && uclamp_min >
>> capacity_orig_thermal (region c)`
>>
>> (2) enhances the CPU selection in sic() and feec() to cater for
>> this new return value.
>>
>> IMHO this doesn't make the intention of CapInv in util_fits_cpu()
>> obsolete, which is using:
>>
>> in CapInv:
>>
>> capacity_orig = capacity_orig_of() - thermal_load_avg
>> capacity_orig_thermal = capacity_orig_of() - thermal_load_avg
>>
>> not in CapInv:
>>
>> capacity_orig = capacity_orig_of()
>> capacity_orig_thermal = capacity_orig_of() - th_pressure
>>
>> Maybe I'm still missing part of the story?
>>
>> v3 hints at removing the bits in the next version:
>>
>> https://lkml.kernel.org/r/20230115001906.v7uq4ddodrbvye7d@airbuntu
>>
>>> kernel/sched/fair.c | 105 ++++++++++++++++++++++++++++++++++----------
>>> 1 file changed, 82 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index d4db72f8f84e..54e14da53274 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -4561,8 +4561,8 @@ static inline int util_fits_cpu(unsigned long util,
>>>  	 * handle the case uclamp_min > uclamp_max.
>>>  	 */
>>>  	uclamp_min = min(uclamp_min, uclamp_max);
>>> -	if (util < uclamp_min && capacity_orig != SCHED_CAPACITY_SCALE)
>>> -		fits = fits && (uclamp_min <= capacity_orig_thermal);
>>> +	if (fits && (util < uclamp_min) && (uclamp_min > capacity_orig_thermal))
>>> +		return -1;
>>
>> Or does the definition 'return -1 if util fits but uclamp doesn't' make
>> the distinction between capacity_orig and capacity_orig_thermal obsolete
>> and so CapInv?
>
> Yes, that's the key point. When it returns -1, we will continue to
> look for a possible cpu with better performance which replaces CapInv
> with capacity_orig_of() - thermal_load_avg to detect a capacity
> inversion.

I see.

Could you add this paragraph to the patch header so that this part of
the change's intention is clear right away? I know you mentioned it in
a conversation or an email thread somewhere, but I had forgotten about
it in the meantime.
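
For what it's worth, here is a minimal user-space sketch of how I read
the tri-state return value and the way a caller could use it. It is not
the kernel code: struct cpu_model and every *_model() name are
hypothetical, the uclamp_max handling is left out, and the fallback
ranking (largest thermally reduced capacity among the -1 CPUs) is my
assumption about the intent rather than what sic() or feec() actually do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU values the scheduler reads. */
struct cpu_model {
	unsigned long capacity;              /* models capacity_of() */
	unsigned long capacity_orig_thermal; /* original capacity minus thermal pressure */
};

/* util fits when it stays below roughly 80% of the capacity. */
static bool fits_capacity_model(unsigned long util, unsigned long cap)
{
	return util * 1280 < cap * 1024;
}

/*
 * Simplified model of the new return values (uclamp_max handling omitted):
 *   1: util fits and the uclamp_min hint can be honoured
 *   0: util does not fit
 *  -1: util fits, but uclamp_min is above what the CPU can deliver
 */
static int util_fits_cpu_model(unsigned long util, unsigned long uclamp_min,
			       unsigned long uclamp_max,
			       const struct cpu_model *cpu)
{
	bool fits = fits_capacity_model(util, cpu->capacity);

	/* Handle the case uclamp_min > uclamp_max. */
	if (uclamp_min > uclamp_max)
		uclamp_min = uclamp_max;

	if (fits && util < uclamp_min && uclamp_min > cpu->capacity_orig_thermal)
		return -1;

	return fits;
}

/*
 * Assumed selection policy: return the first CPU that fully fits. If there
 * is none, fall back to the -1 CPU with the largest thermally reduced
 * capacity. Return -1 when util fits nowhere.
 */
static int select_cpu_model(unsigned long util, unsigned long uclamp_min,
			    unsigned long uclamp_max,
			    const struct cpu_model *cpus, size_t nr)
{
	int best = -1;
	unsigned long best_cap = 0;

	for (size_t i = 0; i < nr; i++) {
		int fits = util_fits_cpu_model(util, uclamp_min, uclamp_max,
					       &cpus[i]);

		if (fits > 0)
			return (int)i;

		if (fits < 0 && cpus[i].capacity_orig_thermal > best_cap) {
			best = (int)i;
			best_cap = cpus[i].capacity_orig_thermal;
		}
	}

	return best;
}
```

With a little CPU (capacity 400, thermally reduced original capacity
446) and a big one (1000, 1024), a task with util = 100 and
uclamp_min = 512 gets -1 on the little CPU and 1 on the big one, so the
search moves on to the big CPU instead of flagging a misfit.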

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>

[...]