Re: nohz problem with idle time on old hardware

From: Steven Rostedt
Date: Wed Nov 13 2013 - 10:31:41 EST


On Wed, 13 Nov 2013 10:21:53 -0500
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
trace-cmd record -p function -l '*nohz*' -l account_process_tick -e sched_switch
>
> rcu_sche-9 0d... 6858.618033: sched_switch: rcu_sched:9 [120] S ==> swapper/0:0 [120]
> <idle>-0 0.... 6858.618082: function: tick_nohz_idle_enter <-- cpu_startup_entry
> <idle>-0 0d... 6858.618101: function: __tick_nohz_idle_enter <-- tick_nohz_idle_enter
> <idle>-0 0d.s. 6858.621499: function: tick_nohz_stop_idle <-- tick_check_idle
> <idle>-0 0d.h. 6858.621550: function: account_process_tick <-- update_process_times
> <idle>-0 0d... 6858.621769: function: tick_nohz_irq_exit <-- irq_exit
> <idle>-0 0d... 6858.621787: function: __tick_nohz_idle_enter <-- irq_exit
> <idle>-0 0d.s. 6858.625500: function: tick_nohz_stop_idle <-- tick_check_idle
> <idle>-0 0d.h. 6858.625574: function: account_process_tick <-- update_process_times
> <idle>-0 0d... 6858.625818: function: tick_nohz_irq_exit <-- irq_exit
> <idle>-0 0d... 6858.625847: function: __tick_nohz_idle_enter <-- irq_exit
> <idle>-0 0d.s. 6858.629295: function: tick_nohz_stop_idle <-- tick_check_idle
> <idle>-0 0d... 6858.629351: function: tick_nohz_irq_exit <-- irq_exit
> <idle>-0 0d... 6858.629373: function: __tick_nohz_idle_enter <-- irq_exit
> <idle>-0 0d.s. 6858.629503: function: tick_nohz_stop_idle <-- tick_check_idle
> <idle>-0 0d.h. 6858.629569: function: account_process_tick <-- update_process_times
> <idle>-0 0d... 6858.629851: function: tick_nohz_irq_exit <-- irq_exit
> <idle>-0 0d... 6858.629881: function: __tick_nohz_idle_enter <-- irq_exit
> <idle>-0 0d.s. 6858.630412: function: tick_nohz_stop_idle <-- tick_check_idle
> <idle>-0 0.N.. 6858.630550: function: tick_nohz_idle_exit <-- cpu_startup_entry
> <idle>-0 0d... 6858.630605: sched_switch: swapper/0:0 [120] R ==> rcu_sched:9 [120]
>
> I'm not saying that we are actually getting into nohz, but something
> with the nohz code is messing with cpu accounting.

The trace does indeed show that a tick is happening, as the config has
HZ=250 (4ms) and we see a tick happen every 4ms. But for some reason,
we don't update the the idle time correctly when nohz is enabled.

When I say nohz is enabled, I mean that we don't have nohz=off in the
command line. There seems to be some difference between having nohz=off
and having nohz disabled at runtime.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/