Re: [PATCH V2 1/4] cpufreq: stats: Defer stats update to cpufreq_stats_record_transition()

From: Lukasz Luba
Date: Thu Sep 24 2020 - 12:10:33 EST




On 9/24/20 1:39 PM, Viresh Kumar wrote:
On 24-09-20, 13:07, Rafael J. Wysocki wrote:
On Thu, Sep 24, 2020 at 1:00 PM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
On 9/24/20 11:24 AM, Rafael J. Wysocki wrote:
On Thu, Sep 24, 2020 at 11:25 AM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:

I wonder if we could just drop the reset feature. Is there a tool
which uses this file? The 'reset' sysfs file would probably have to stay
forever, but would an empty implementation be an option?

Well, having an empty sysfs attr would be a bit ugly, but the
implementation of it could be simplified.

The documentation states:
'This can be useful for evaluating system behaviour under different
governors without the need for a reboot.'
With the scenario of fast-switch this resetting complicates the
implementation and the justification of having it just for experiments
avoiding reboot is IMO weak. The real production code would have to pay
extra cycles every time. Also, we would probably not experiment with
different cpufreq governors, since schedutil is considered the best
option.

It would still be good to have a way to test it against the other
available options, though.


Experimenting with different governors would still be possible, just
the user-space would have to take a snapshot of the stats when switching
to a new governor. Then the values presented in the stats would just
need to be calculated in this user tool against the snapshot.
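
To illustrate the snapshot idea, here is a minimal user-space sketch. It
assumes the usual 'time_in_state' layout of one "<frequency> <time>" pair
per line; the helper names (parse_time_in_state, delta_since,
read_time_in_state) are made up for this example and are not part of any
existing tool.

```python
def parse_time_in_state(text):
    """Parse '<freq> <time>' lines into a {freq: time} dict."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        stats[int(fields[0])] = int(fields[1])
    return stats


def delta_since(snapshot, current):
    """Time spent per frequency since the snapshot was taken."""
    return {freq: t - snapshot.get(freq, 0) for freq, t in current.items()}


def read_time_in_state(cpu=0):
    """Read the stats for one CPU (the path is assumed to exist)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/stats/time_in_state"
    with open(path) as f:
        return parse_time_in_state(f.read())
```

The tool would call read_time_in_state() right after switching the
governor, run the workload, read the file again and report
delta_since(before, after) instead of relying on the 'reset' file.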

The resetting is also not that bad, since nowadays more components
maintain some kind of local statistics/history (scheduler, thermal).
I would recommend resetting the whole system and repeating the same tests
with a different governor, just to be sure that everything starts from
a similar state (utilization, temperature, frequencies of other devfreq
devices, etc.).

Well, if everyone agrees on removing the reset feature, let's drop the
sysfs attr too, as it would be useless going forward.

Admittedly, I don't have a strong opinion and since intel_pstate
doesn't use a frequency table, this is not relevant for systems using
that driver anyway.

I added this file some time back as it made my life a lot easier while testing
some scheduler-related changes and seeing how they affect cpufreq updates. IMO this
is a useful feature and we don't really need to get rid of it.

Let's see where the discussion goes about the feedback you gave.


Supporting this reset file makes the code a bit more complex, and that
code is also visited from the scheduler. I don't know if the
config for stats is enabled in production kernels, but if it is,
then forcing everyone to keep that reset code might be too much.
For an engineering kernel version it is OK.

I would say for most normal checks these sysfs stats are very useful.
If there is a need for an investigation like the one you described, the
trace event is there, it just has to be enabled. Tools like LISA would
help with parsing the trace and mapping it to some plots, or even
merging it with scheduler context.

From time to time some engineers ask why the stats
don't show the values (fast-switch tracking is missing). I think
they are interested in a simple use case; otherwise they would use
tracing.

Regards,
Lukasz