Re: [PATCH] tracing/function-graph-tracer: prevent from hrtimer interrupt eternal loop

From: Frédéric Weisbecker
Date: Wed Dec 17 2008 - 19:18:00 EST


2008/12/18 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>:
> On Thu, 18 Dec 2008 00:52:47 +0100
> Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>
>> Impact: fix a system hang on slow systems
>>
>> While testing the function graph tracer on VirtualBox, I had a system hang
>> immediately after enabling the tracer.
>> If hrtimer is enabled on kernel, a slow system can spend too much time during
>> tracing the hrtimer_interrupt which will do eternal loops, assuming it always
>> have to retry its process because too much time elapsed during its time
>> update.
>> Now we provide a feature which lurks at the number of retries on
>> hrtimer_interrupt. After 10 retries, the function graph tracer will finally stop
>> its tracing.
>>
>> Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
>> ---
>> arch/x86/kernel/ftrace.c | 2 ++
>> include/linux/ftrace.h | 31 +++++++++++++++++++++++++++++++
>> kernel/hrtimer.c | 5 +++++
>> 3 files changed, 38 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
>> index 1b43086..244f178 100644
>> --- a/arch/x86/kernel/ftrace.c
>> +++ b/arch/x86/kernel/ftrace.c
>> @@ -389,6 +389,8 @@ void ftrace_nmi_exit(void)
>>
>> #endif /* !CONFIG_DYNAMIC_FTRACE */
>>
>> +DEFINE_PER_CPU(int, ftrace_graph_hrtimer_loop) = LOCAL_INIT(0);
>
> Strange. This has type `int', but we initialise it with a local_t?


Oops...


>> /* Add a function return address to the trace stack on thread info.*/
>> static int push_return_trace(unsigned long ret, unsigned long long time,
>> unsigned long func, int *depth)
>> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
>> index 286af82..39be782 100644
>> --- a/include/linux/ftrace.h
>> +++ b/include/linux/ftrace.h
>> @@ -408,6 +408,34 @@ extern void unregister_ftrace_graph(void);
>> extern void ftrace_graph_init_task(struct task_struct *t);
>> extern void ftrace_graph_exit_task(struct task_struct *t);
>>
>> +/*
>> + * Those two functions prevent from eternal loops on hrtimer_interrupt
>
> "These two functions prevent infinite ..."
>
>> + * because the function graph tracer can cause two much time processing
>
> "can consume too much"
>


Yeah, I had bad marks at school in english :-)


>> + * on slow systems, and the hrtimer_interrupt can relaunch its processing
>> + * if too much time elapsed during its previous time update.
>> + *
>> + * Note that we don't need an atomic counter for loops here because
>> + * interrupts are disabled in hrtimer_interrupt.
>> + */
>> +
>> +extern DEFINE_PER_CPU(int, ftrace_graph_hrtimer_loop);
>
> We have DECLARE_PER_CPU() for this.


Ok.


>> +static inline void ftrace_graph_hrtimer_enter(void)
>> +{
>> + __get_cpu_var(ftrace_graph_hrtimer_loop) = 0;
>> +}
>> +
>> +static inline void ftrace_graph_hrtimer_retry(void)
>> +{
>> + __get_cpu_var(ftrace_graph_hrtimer_loop)++;
>> +
>> + if (__get_cpu_var(ftrace_graph_hrtimer_loop) == 10) {
>> + ftrace_graph_stop();
>> + printk(KERN_WARNING "function-graph-tracer: hrtimer_interrupt "
>> + "exceeded 10 loops. Stopping tracing\n");
>> + }
>> +}
>
> I suspect this code would end up considerably neater (and less likely
> to break the build) if these functons were not inlined. For a start,
> ftrace_graph_hrtimer_loop would not need global scope.


I agree. I wanted to make them inlined to minimize the performance issues on
hrtimer_interrupt. But that will only have an impact if this tracer is
configured.
I will move these functions and the per_cpu counters in
kernel/trace/ftrace.c and make
ftrace_graph_hrtimer_loop as static.

>> static inline int task_curr_ret_stack(struct task_struct *t)
>> {
>> return t->curr_ret_stack;
>> @@ -437,6 +465,9 @@ static inline int task_curr_ret_stack(struct task_struct *tsk)
>>
>> static inline void pause_graph_tracing(void) { }
>> static inline void unpause_graph_tracing(void) { }
>> +
>> +static inline void ftrace_graph_hrtimer_enter(void) { };
>> +static inline void ftrace_graph_hrtimer_retry(void) { };
>
> Unneeded semicolons.
>
> Please consider doing
>
> static inline void ftrace_graph_hrtimer_enter(void)
> {
> }


Ok.


>> #endif
>>
>> #ifdef CONFIG_TRACING
>> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
>> index b741f85..662eb53 100644
>> --- a/kernel/hrtimer.c
>> +++ b/kernel/hrtimer.c
>> @@ -44,6 +44,7 @@
>> #include <linux/seq_file.h>
>> #include <linux/err.h>
>> #include <linux/debugobjects.h>
>> +#include <linux/ftrace.h>
>>
>> #include <asm/uaccess.h>
>>
>> @@ -1186,7 +1187,11 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>> cpu_base->nr_events++;
>> dev->next_event.tv64 = KTIME_MAX;
>>
>> + ftrace_graph_hrtimer_enter();
>> retry:
>> + /* Prevent from eternal loop on slow systems while tracing */
>> + ftrace_graph_hrtimer_retry();
>> +
>> now = ktime_get();
>>
>> expires_next.tv64 = KTIME_MAX;
>> --
>> 1.6.0.4
>


Thanks Andrew!
I will send a V2 to apply your comments.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/