Re: [PATCH v3 2/2] sched: consider missed ticks when updating global cpu load

From: Peter Zijlstra
Date: Fri Oct 02 2015 - 12:04:52 EST


On Fri, Oct 02, 2015 at 04:46:14PM +0900, byungchul.park@xxxxxxx wrote:
> From: Byungchul Park <byungchul.park@xxxxxxx>
>
> In hrtimer_interrupt(), the first tick_program_event() can fail
> because the next timer may already have expired (see the comment in
> hrtimer_interrupt()), due to:
>
> - tracing
> - long lasting callbacks

If anything keeps interrupts disabled for longer than 1 tick, you'd
better go fix that.

> - being scheduled away when running in a VM

Not sure how much I should care about that, and this patch is completely
wrong for that anyhow.

And this case in hrtimer_interrupt() is basically a failure case; if you
hit that, you've got bigger problems. The solution is to rework things
so you don't get there.


> If the first tick_program_event() fails, the second
> tick_program_event() sets the expiry time to more than one tick later.
> The next tick can then arrive more than one tick after the previous
> one, even though the tick has not been stopped by e.g. NOHZ.
>
> When the next tick occurs, update_process_times() -> scheduler_tick()
> -> update_cpu_load_active() runs, assuming that the distance between
> the last tick and the current tick is 1 tick! That is wrong in this
> case. Thus, this abnormal case should be handled in
> update_cpu_load_active().

Everything in update_process_times() assumes 1 tick, just fixing up
one function inside that callchain is wrong -- I've already told you
that.
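
To make the discrepancy under discussion concrete, here is a rough
userspace model, not kernel code. It assumes a simplified decay of the
form load = (old * (scale - 1) + cur) / scale with scale = 1 << idx; the
names decay_one_tick() and update_load() are made up for illustration,
and the real update_cpu_load_active() path differs in detail (rounding,
NOHZ handling).

```c
/*
 * Hypothetical model of a per-tick load average. Illustration only:
 * these helpers do not exist in the kernel under these names.
 */

/* Apply one tick's worth of decay toward cur_load. */
static unsigned long decay_one_tick(unsigned long old_load,
				    unsigned long cur_load, int idx)
{
	unsigned long scale = 1UL << idx;

	return (old_load * (scale - 1) + cur_load) / scale;
}

/* Apply the decay once for every tick that actually elapsed. */
static unsigned long update_load(unsigned long old_load,
				 unsigned long cur_load, int idx,
				 unsigned long ticks)
{
	while (ticks--)
		old_load = decay_one_tick(old_load, cur_load, idx);

	return old_load;
}
```

With old_load = 1024, cur_load = 0 and idx = 1, a single update gives 512,
whereas accounting for three elapsed ticks gives 128. The same one-tick
assumption applies to the other callers under update_process_times(),
which is why adjusting only the load update leaves the rest of the
callchain inconsistent.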

