Re: [patch 07/17] LTTng instrumentation - timer

From: Mathieu Desnoyers
Date: Wed Jul 16 2008 - 10:34:25 EST


* Peter Zijlstra (peterz@xxxxxxxxxxxxx) wrote:
> On Tue, 2008-07-15 at 18:26 -0400, Mathieu Desnoyers wrote:
> > plain text document attachment (lttng-instrumentation-timer.patch)
> > Instrument timer activity (timer set, expired, current time updates) to keep
> > information about the "real time" flow within the kernel. It can be used by a
> > trace analysis tool to synchronize information coming from various sources, e.g.
> > to merge traces with system logs.
> >
> > Those tracepoints are used by LTTng.
> >
> > About the performance impact of tracepoints (which is comparable to markers),
> > even without immediate values optimizations, tests done by Hideo Aoki on ia64
> > show no regression. His test case was using hackbench on a kernel where
> > scheduler instrumentation (about 5 events in code scheduler code) was added.
> > See the "Tracepoints" patch header for performance result detail.
> >
> > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>
> > CC: 'Ingo Molnar' <mingo@xxxxxxx>
> > CC: "David S. Miller" <davem@xxxxxxxxxxxxx>
> > CC: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> > CC: 'Peter Zijlstra' <peterz@xxxxxxxxxxxxx>
> > CC: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
> > CC: 'Hideo AOKI' <haoki@xxxxxxxxxx>
> > CC: Takashi Nishiie <t-nishiie@xxxxxxxxxxxxxxxxxx>
> > CC: 'Steven Rostedt' <rostedt@xxxxxxxxxxx>
> > CC: Eduard - Gabriel Munteanu <eduard.munteanu@xxxxxxxxxxx>
> > ---
> > include/trace/timer.h | 24 ++++++++++++++++++++++++
> > kernel/itimer.c | 5 +++++
> > kernel/timer.c | 8 +++++++-
> > 3 files changed, 36 insertions(+), 1 deletion(-)
> >
> > Index: linux-2.6-lttng/kernel/itimer.c
> > ===================================================================
> > --- linux-2.6-lttng.orig/kernel/itimer.c 2008-07-15 14:49:14.000000000 -0400
> > +++ linux-2.6-lttng/kernel/itimer.c 2008-07-15 15:14:28.000000000 -0400
> > @@ -12,6 +12,7 @@
> > #include <linux/time.h>
> > #include <linux/posix-timers.h>
> > #include <linux/hrtimer.h>
> > +#include <trace/timer.h>
> >
> > #include <asm/uaccess.h>
> >
> > @@ -132,6 +133,8 @@ enum hrtimer_restart it_real_fn(struct h
> > struct signal_struct *sig =
> > container_of(timer, struct signal_struct, real_timer);
> >
> > + trace_timer_itimer_expired(sig);
> > +
> > kill_pid_info(SIGALRM, SEND_SIG_PRIV, sig->leader_pid);
> >
> > return HRTIMER_NORESTART;
> > @@ -157,6 +160,8 @@ int do_setitimer(int which, struct itime
> > !timeval_valid(&value->it_interval))
> > return -EINVAL;
> >
> > + trace_timer_itimer_set(which, value);
> > +
> > switch (which) {
> > case ITIMER_REAL:
> > again:
> > Index: linux-2.6-lttng/kernel/timer.c
> > ===================================================================
> > --- linux-2.6-lttng.orig/kernel/timer.c 2008-07-15 14:51:50.000000000 -0400
> > +++ linux-2.6-lttng/kernel/timer.c 2008-07-15 15:14:28.000000000 -0400
> > @@ -37,12 +37,14 @@
> > #include <linux/delay.h>
> > #include <linux/tick.h>
> > #include <linux/kallsyms.h>
> > +#include <trace/timer.h>
> >
> > #include <asm/uaccess.h>
> > #include <asm/unistd.h>
> > #include <asm/div64.h>
> > #include <asm/timex.h>
> > #include <asm/io.h>
> > +#include <asm/irq_regs.h>
> >
> > u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;
> >
> > @@ -288,6 +290,7 @@ static void internal_add_timer(struct tv
> > i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
> > vec = base->tv5.vec + i;
> > }
> > + trace_timer_set(timer);
> > /*
> > * Timers are FIFO:
> > */
> > @@ -1066,6 +1069,7 @@ void do_timer(unsigned long ticks)
> > {
> > jiffies_64 += ticks;
> > update_times(ticks);
> > + trace_timer_update_time(&xtime, &wall_to_monotonic);
> > }
>
> This is a very dangerous trace point - we're holding xtime lock here.
>
> Ah, I see you make that comment below too, are you sure you want to do
> this?
>

Yes, this is done on purpose: I want to know the relationship between the
evolution of the "system time" and my more precise time source (usually
the TSC), so I can tell how to interleave events from system logs and
low-level traces.
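
To illustrate the interleaving, here is a minimal userspace sketch, not
LTTng code. It assumes an analysis tool has collected pairs of (TSC value,
system time) at successive timer_update_time events and linearly
interpolates between two of them. The struct layout and the function name
are hypothetical, and the double arithmetic is only adequate for the short
interval between two adjacent samples.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One synchronization sample: both clocks read at the same instant, as a
 * probe on timer_update_time could record them (hypothetical layout).
 */
struct sync_point {
	uint64_t tsc;		/* cycle counter value */
	int64_t wall_ns;	/* system time, in nanoseconds */
};

/*
 * Map a TSC value lying between two sync points onto the system-time
 * axis by linear interpolation.
 */
int64_t tsc_to_wall_ns(const struct sync_point *a,
		       const struct sync_point *b, uint64_t tsc)
{
	/* The sketch only handles values inside the [a, b] interval. */
	assert(b->tsc > a->tsc && tsc >= a->tsc && tsc <= b->tsc);

	double cycles = (double)(b->tsc - a->tsc);
	double wall = (double)(b->wall_ns - a->wall_ns);
	double offset = (double)(tsc - a->tsc);

	return a->wall_ns + (int64_t)(offset * wall / cycles);
}
```

A trace event stamped with a TSC value can then be placed among syslog
entries by comparing the interpolated result with their timestamps.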

However, it implies that a tracer connected to this tracepoint cannot
take the xtime lock (or must be aware that it is already taken). Since
LTTng has been designed to support NMIs, and because seqlocks and NMIs
do not mix well together (they cause deadlocks), LTTng avoids seqlocks
and I have no such concern. I created my own RCU-based 32-to-64-bit
counter extension infrastructure for that precise purpose (so I could
support architectures which provide hardware counters with fewer than
64 bits).
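
The idea of extending a 32-bit counter to 64 bits can be sketched as
follows. This is a deliberately simplified, single-threaded userspace
illustration and not the RCU-based LTTng infrastructure: it has no
protection against concurrent readers, interrupts or NMIs, and it assumes
the counter is sampled at least once per wraparound period. The names
ext_counter and ext_counter_read are hypothetical.

```c
#include <stdint.h>

/* Software state kept next to a 32-bit hardware counter. */
struct ext_counter {
	uint32_t last_lo;	/* last raw hardware value seen */
	uint32_t hi;		/* number of wraparounds observed so far */
};

/*
 * Extend a raw 32-bit reading to 64 bits. A reading smaller than the
 * previous one is taken to mean the hardware counter wrapped once.
 */
uint64_t ext_counter_read(struct ext_counter *c, uint32_t raw)
{
	if (raw < c->last_lo)
		c->hi++;
	c->last_lo = raw;
	return ((uint64_t)c->hi << 32) | raw;
}
```

Making the update of the (last_lo, hi) pair safe against a reader that
interrupts it is the hard part, and that is what the sketch leaves out.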

You'll notice that as tracepoints are added, one must be more and more
careful about which locks a tracer takes and which parts of the kernel
infrastructure it uses. But nobody said tracing was easy. ;-)
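
To show why a probe running under the xtime lock must not try to read
through that same seqlock, here is a toy single-threaded model of a
sequence lock. It is not the kernel implementation: there are no memory
barriers and no writer spinlock, all names are hypothetical, and the read
loop is bounded so that the example terminates where an unbounded reader
would spin forever.

```c
#include <stdbool.h>

/* Toy sequence lock: seq is odd while a writer is in its critical section. */
struct seq_sketch {
	unsigned seq;
	long value;
};

void seq_write_begin(struct seq_sketch *s) { s->seq++; }
void seq_write_end(struct seq_sketch *s) { s->seq++; }

/*
 * Try to read a consistent value. Returns false after max_tries failed
 * attempts instead of retrying indefinitely.
 */
bool seq_read_bounded(const struct seq_sketch *s, long *out, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		unsigned start = s->seq;

		if (start & 1)
			continue;	/* writer active: retry */

		long v = s->value;

		if (s->seq == start) {	/* no writer interfered */
			*out = v;
			return true;
		}
	}
	return false;
}
```

If seq_read_bounded() is called between seq_write_begin() and
seq_write_end() on the same CPU, as a probe called with the write side
held or an NMI handler interrupting the writer would do, the sequence
stays odd and the read never succeeds, because the writer cannot make
progress until the reader returns.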

Mathieu

> Thomas, any input?
>
> > #ifdef __ARCH_WANT_SYS_ALARM
> > @@ -1147,7 +1151,9 @@ asmlinkage long sys_getegid(void)
> >
> > static void process_timeout(unsigned long __data)
> > {
> > - wake_up_process((struct task_struct *)__data);
> > + struct task_struct *task = (struct task_struct *)__data;
> > + trace_timer_timeout(task);
> > + wake_up_process(task);
> > }
> >
> > /**
> > Index: linux-2.6-lttng/include/trace/timer.h
> > ===================================================================
> > --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> > +++ linux-2.6-lttng/include/trace/timer.h 2008-07-15 15:14:28.000000000 -0400
> > @@ -0,0 +1,24 @@
> > +#ifndef _TRACE_TIMER_H
> > +#define _TRACE_TIMER_H
> > +
> > +#include <linux/tracepoint.h>
> > +
> > +DEFINE_TRACE(timer_itimer_expired,
> > + TPPROTO(struct signal_struct *sig),
> > + TPARGS(sig));
> > +DEFINE_TRACE(timer_itimer_set,
> > + TPPROTO(int which, struct itimerval *value),
> > + TPARGS(which, value));
> > +DEFINE_TRACE(timer_set,
> > + TPPROTO(struct timer_list *timer),
> > + TPARGS(timer));
> > +/*
> > + * xtime_lock is taken when kernel_timer_update_time tracepoint is reached.
> > + */
> > +DEFINE_TRACE(timer_update_time,
> > + TPPROTO(struct timespec *_xtime, struct timespec *_wall_to_monotonic),
> > + TPARGS(_xtime, _wall_to_monotonic));
> > +DEFINE_TRACE(timer_timeout,
> > + TPPROTO(struct task_struct *p),
> > + TPARGS(p));
> > +#endif
> >
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68