Re: [Patch v9 7/8] sched/fair: Enable tuning of decay period

From: Thara Gopinath
Date: Tue Feb 18 2020 - 09:57:31 EST




On 2/14/20 5:26 AM, Dietmar Eggemann wrote:
On 13/02/2020 14:54, Thara Gopinath wrote:
On 02/10/2020 06:59 AM, Dietmar Eggemann wrote:
On 07/02/2020 23:42, Thara Gopinath wrote:
On 02/04/2020 03:39 AM, Dietmar Eggemann wrote:
On 03/02/2020 16:55, Peter Zijlstra wrote:
On Mon, Feb 03, 2020 at 07:07:57AM -0500, Thara Gopinath wrote:
On 01/28/2020 06:56 PM, Randy Dunlap wrote:
Hi,

On 1/28/20 2:36 PM, Thara Gopinath wrote:

[...]

is really not saying from which review comment the individual changes in
the function name are coming from. And I don't see an answer to Ionela's
email saying that her proposal will manifest in a particular part of
this change.
Hi Dietmar,

Like I said, don't want to argue on name. It is trivial for me. I have
v10 prepped with the name change. Will send it out shortly.

Thanks.

[...]

Cpu-invariant accounting can't be guarded with a kernel CONFIG switch.
Frequency-invariant accounting could be with CONFIG_CPU_FREQ but this is
enabled by default by Arm64 defconfig.
Thermal pressure (accounting) (CONFIG_HAVE_SCHED_THERMAL_PRESSURE) is
disabled by default so why should a per-cpu thermal_pressure be
maintained on such a system (CONFIG_CPU_THERMAL=y by default)?

I agree that there is no need for per-cpu thermal pressure to be
maintained if no averaging is happening in the scheduler, today. I don't
know if there will ever be an use for it.

All arch_scale_FOO() functions follow the approach to force the arch
(currently x86, arm, arm64) to do

#define arch_scale_FOO BAR

to enable the FOO functionality.

There is no direct link between consumer and provider here.

consumer (sched) -> arch <- provider (arch, counters, CPUfreq, CPU
cooling, etc.)

So IMHO, FOO=thermal_pressure should follow this design pattern too.

'thermal_pressure' would be the only one which can be disabled by a
kernel config switch at the consumer side.
IMHO, it doesn't make sense to have the provider operating in this case.

My issue has to do with using a config option meant for internal
scheduler code being used else where. To me, once this happens, the
entire work done to separate out reading and writing of instantaneous
thermal pressure to arch_topology makes no sense. We could have kept it
in scheduler itself.

You might see thermal_pressure more on the level of irq_load or
[rt/dl]_rq_load and that could be why we have a different opinion here?

Now rt_rq_load and dl_rq_load are scheduler internal providers and
irq_load is driven by 'irq_delta + steal' time (which is much closer to
the scheduler than thermal for instance).

In this case, thermal pressure is quite close to scheduler as it reduces the maximum capacity available per cpu and hence affects scheduler placement of tasks


My assumption is that we don't want a direct link between the scheduler
and e.g. a provider 'thermal'.

Exactly. Which is why the same CONFIG option should not be used between the provider and consumer.


Another way I think about this whole thermal pressure framework is that
it is the job of cooling device or cpufreq or any other entity to update
a throttle in maximum pressure to the scheduler. It should be
independent of what scheduler does with it. Scheduler can choose to
ignore it

--
Warm Regards
Thara