Re: [RFC v4 1/1] selftests/cpuidle: Add support for cpuidle latency measurement

From: Pratik Sampat
Date: Thu Sep 03 2020 - 08:29:42 EST


Hello Artem,

On 02/09/20 8:55 pm, Artem Bityutskiy wrote:
> On Wed, 2020-09-02 at 17:15 +0530, Pratik Rajesh Sampat wrote:
>> Measure cpuidle latencies on wakeup to determine and compare with the
>> advertised wakeup latencies for each idle state.

Thank you for pointing me to your talk. It was very interesting!
I certainly did not know that the Intel architecture is aware of
timers and pre-wakes the CPUs, which makes the timer experiment
observations void.

> It looks like the measurements include more than just C-state wake,
> they also include the overhead of waking up the process, context switch,
> and potentially any interrupts that happen on that CPU. I am not saying
> this is not interesting data, it surely is, but it is going to be
> larger than you see in cpuidle latency tables. Potentially
> significantly larger.

The measurements will definitely include more overhead than just the
C-state wakeup.

However, we also collect a baseline measurement by running the same
test on a 100% busy CPU; the latency measured there can be treated as
the kernel-userspace overhead. The rest of the measurements are then
interpreted with this baseline in mind.
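
To make that concrete, here is a rough shell sketch of the baseline
subtraction. The debugfs node below is a placeholder for "trigger an
IPI to a CPU and report the observed wakeup latency in ns", not the
exact interface exposed by the test module in this patch:

    #!/bin/bash
    # Sketch only: path and file names are placeholders for illustration.
    DEBUGFS=/sys/kernel/debug/latency_test
    TARGET_CPU=4

    measure_ipi_ns() {
            echo "$1" > "$DEBUGFS/ipi_cpu"
            cat "$DEBUGFS/ipi_latency_ns"
    }

    # Baseline: keep the target CPU 100% busy so it never enters a
    # C-state; whatever latency remains is kernel/userspace overhead.
    taskset -c "$TARGET_CPU" sh -c 'while :; do :; done' &
    busy_pid=$!
    sleep 1
    baseline=$(measure_ipi_ns "$TARGET_CPU")
    kill "$busy_pid"

    # Idle case: let the CPU go idle and repeat the measurement.
    sleep 1
    observed=$(measure_ipi_ns "$TARGET_CPU")

    echo "baseline: ${baseline} ns"
    echo "observed: ${observed} ns"
    echo "adjusted: $((observed - baseline)) ns"

The comparison against the advertised per-state latencies would then
use the adjusted number rather than the raw one.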

> Therefore, I am not sure this program should be advertised as "cpuidle
> measurement". It really measures the "IPI latency" in case of the IPI
> method.

With the newfound knowledge of timers on Intel, I understand that this
really only seems to measure IPI latency and not timer latency,
although both observations shouldn't be too far apart anyway.

>> A baseline measurement for each case of IPI and timers is taken at
>> 100 percent CPU usage to quantify the kernel-userspace overhead
>> during execution.
> At least on Intel platforms, this will mean that the IPI method won't
> cover deep C-states like, say, PC6, because one CPU is busy. Again, not
> saying this is not interesting, just pointing out the limitation.

That's a valid point. We have similar deep idle states on POWER too.
The idea here is that this test should be run on an already idle
system. Of course there will be kernel jitter along the way, which can
skew the observations slightly on some CPUs, but I believe the
observations overall should be stable.

Another solution could be to use isolcpus, but that just increases the
complexity all the more.
If you have any suggestions for another way to guarantee idleness,
that would be great.
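
For reference, one rough way to at least detect (rather than
guarantee) idleness is to sample /proc/stat around the run and flag
CPUs that are more than a few percent busy. A minimal sketch, not part
of the current patch:

    #!/bin/bash
    # Sketch: sample /proc/stat twice and report the busy fraction per
    # CPU, so the test could warn or bail out if the system is noisy.
    snapshot() { grep '^cpu[0-9]' /proc/stat; }

    before=$(snapshot)
    sleep 2
    after=$(snapshot)

    paste <(echo "$before") <(echo "$after") | awk '
    {
            # Each cpuN line has 10 jiffies counters; after paste,
            # fields 2-11 are the first sample, 13-22 the second.
            t1 = 0; for (i = 2; i <= 11; i++) t1 += $i
            t2 = 0; for (i = 13; i <= 22; i++) t2 += $i
            idle1 = $5 + $6; idle2 = $16 + $17   # idle + iowait
            busy = 1 - (idle2 - idle1) / (t2 - t1)
            printf "%s: %.1f%% busy\n", $1, busy * 100
            if (busy > 0.05) noisy++
    }
    END { if (noisy) print noisy, "CPU(s) above 5% busy; results may be skewed" }'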


> I was working on somewhat similar stuff for x86 platforms, and I am
> almost ready to publish that on github. I can notify you when I do so
> if you are interested. But here is a small presentation of the approach
> that I gave at Plumbers last year:
>
> https://youtu.be/Opk92aQyvt0?t=8266
>
> (the link points to the start of my talk)

Sure thing. Do notify me when it comes up.
I would be happy to have a look at it.

--
Thanks!
Pratik