Re: [RFC PATCH v2 0/8] Introduct cpu idle prediction functionality

From: Li, Aubrey
Date: Mon Oct 16 2017 - 03:44:49 EST


On 2017/10/14 9:14, Rafael J. Wysocki wrote:
> On Saturday, September 30, 2017 9:20:26 AM CEST Aubrey Li wrote:
>> We found that under some latency-intensive workloads, short idle periods
>> occur very often, and the idle entry and exit paths then start to dominate,
>> so it's important to optimize them. To detect the short idle pattern, we need
>> to figure out how long the coming idle period will be and what the threshold
>> of the short idle interval is.
>>
>> A cpu idle prediction functionality is introduced in this proposal to catch
>> the short idle pattern.
>>
>> First, we check the IRQ timings subsystem to see whether an event is
>> coming soon.
>> -- https://lwn.net/Articles/691297/
>>
>> Second, we check the idle statistics of the scheduler to see whether we are
>> likely to go into a short idle.
>> -- https://patchwork.kernel.org/patch/2839221/
>>
>> Third, we predict the next idle interval by using the prediction
>> functionality in the idle governor, if it has one.
>>
>> For the threshold of the short idle interval, we record the timestamps of
>> the idle entry and multiply by a tunable parameter, exposed here:
>> -- /proc/sys/kernel/fast_idle_ratio
>>
>> In this proposal, we use the output of the idle prediction to skip turning
>> the tick off if a short idle is predicted. Reprogramming the hardware timer
>> twice (off and on) is expensive for a very short idle. Other potential
>> optimizations can be made based on the same indicator.
>>
>> I observed that when the system is idle, the idle predictor reports 20/s
>> long idle and ZERO fast idle on one CPU. When the workload is running, the
>> idle predictor reports 72899/s fast idle and ZERO long idle on the same CPU.
>>
>> Aubrey Li (8):
>>   cpuidle: menu: extract prediction functionality
>>   cpuidle: record the overhead of idle entry
>>   cpuidle: add a new predict interface
>>   tick/nohz: keep tick on for a fast idle
>>   timers: keep sleep length updated as needed
>>   cpuidle: make fast idle threshold tunable
>>   cpuidle: introduce irq timing to make idle prediction
>>   cpuidle: introduce run queue average idle to make idle prediction
>>
>>  drivers/cpuidle/Kconfig          |   1 +
>>  drivers/cpuidle/cpuidle.c        | 109 +++++++++++++++++++++++++++++++++++++++
>>  drivers/cpuidle/governors/menu.c |  69 ++++++++++++++++---------
>>  include/linux/cpuidle.h          |  21 ++++++++
>>  kernel/sched/idle.c              |  14 ++++-
>>  kernel/sysctl.c                  |  12 +++++
>>  kernel/time/tick-sched.c         |   7 +++
>>  7 files changed, 209 insertions(+), 24 deletions(-)
>>
>
> Overall, it looks like you could avoid stopping the tick every time the
> predicted idle duration is not longer than the tick interval in the first
> place.
>
> Why don't you do that?

I didn't quite follow this.

Are you suggesting something like this?

	if (!cpu_stat.fast_idle)
		tick_nohz_idle_enter();

Or are you asking why the threshold can't simply be the tick interval?

For the first, can_stop_idle_tick() is a better place to skip tick-off, IMHO.
For the latter, if the threshold is close or equal to the tick interval, it's
quite possible that the next event is the tick itself and nothing else.
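
For illustration, here is a rough userspace sketch in C of the prediction flow
described in the cover letter. It is not the patch code. struct idle_inputs,
its fields, and the three helper functions are hypothetical names, and the
sketch assumes the threshold is the recorded idle entry overhead multiplied by
the fast_idle_ratio tunable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs to the predictor; all names are illustrative only. */
struct idle_inputs {
	uint64_t next_irq_us;         /* time until the next expected IRQ */
	uint64_t rq_avg_idle_us;      /* scheduler's average idle duration */
	uint64_t gov_predicted_us;    /* idle governor's predicted interval */
	uint64_t entry_overhead_us;   /* measured idle entry overhead */
	unsigned int fast_idle_ratio; /* tunable multiplier (assumed semantics) */
};

/* Assumed threshold: measured entry overhead scaled by the tunable ratio. */
static uint64_t fast_idle_threshold_us(const struct idle_inputs *in)
{
	return in->entry_overhead_us * in->fast_idle_ratio;
}

/* Check the three sources in the order the cover letter lists them. */
static bool predict_fast_idle(const struct idle_inputs *in)
{
	uint64_t threshold = fast_idle_threshold_us(in);

	if (in->next_irq_us < threshold)      /* 1. IRQ timings */
		return true;
	if (in->rq_avg_idle_us < threshold)   /* 2. scheduler idle statistics */
		return true;
	return in->gov_predicted_us < threshold; /* 3. governor prediction */
}

/* Keep the tick running whenever a fast idle is predicted. */
static bool should_stop_tick(const struct idle_inputs *in)
{
	return !predict_fast_idle(in);
}
```

With an overhead of 2us and a ratio of 10, the threshold in this sketch is
20us, so an IRQ expected in 5us counts as a fast idle and the tick stays on.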

Thanks,
-Aubrey