Re: Kernel marker has no performance impact on ia64.

From: Peter Zijlstra
Date: Thu Jun 05 2008 - 04:13:05 EST


On Wed, 2008-06-04 at 19:22 -0400, Mathieu Desnoyers wrote:
> * Peter Zijlstra (peterz@xxxxxxxxxxxxx) wrote:

> > So are you proposing something like:
> >
> > static inline void
> > trace_sched_switch(struct task_struct *prev, struct task_struct *next)
> > {
> > 	trace_mark(sched_switch, prev, next);
> > }
> >
>
> Not exactly. Something more along the lines of
>
> static inline void
> trace_sched_switch(struct task_struct *prev, struct task_struct *next)
> {
> 	/* Internal tracers. */
> 	ftrace_sched_switch(prev, next);
> 	othertracer_sched_switch(prev, next);
> 	/*
> 	 * System-wide tracing. Useful information is exported here.
> 	 * Probes connecting to these markers are expected to only use the
> 	 * information provided to them for data collection purpose. Type
> 	 * casting pointers is discouraged.
> 	 */
> 	trace_mark(kernel_sched_switch, "prev_pid %d next_pid %d prev_state %ld",
> 		prev->pid, next->pid, prev->state);
> }

The advantage of my method would be that ftrace (and othertracer) can use
the same marker and don't need yet another hook.
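
To make the single-hook idea concrete, here is a minimal user-space
sketch, not kernel code. struct task is a cut-down stand-in for
struct task_struct, and register_sched_switch_probe(), the probe table
and both example probes are hypothetical names made up for illustration.
It ignores the locking and registration details a real implementation
would need.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct task_struct; only the fields used here. */
struct task {
	int pid;
	long state;
};

/* A probe receives the raw pointers, not a pre-formatted subset. */
typedef void (*sched_switch_probe)(const struct task *prev,
				   const struct task *next, void *priv);

#define MAX_PROBES 4

static struct {
	sched_switch_probe fn;
	void *priv;
} probes[MAX_PROBES];
static size_t nr_probes;

/* Hypothetical registration helper: returns 0 on success, -1 if full. */
static int register_sched_switch_probe(sched_switch_probe fn, void *priv)
{
	if (nr_probes == MAX_PROBES)
		return -1;
	probes[nr_probes].fn = fn;
	probes[nr_probes].priv = priv;
	nr_probes++;
	return 0;
}

/* The single hook at the tracing site: fans out to every attached probe. */
static inline void trace_sched_switch(const struct task *prev,
				      const struct task *next)
{
	assert(prev && next);
	for (size_t i = 0; i < nr_probes; i++)
		probes[i].fn(prev, next, probes[i].priv);
}

/* Illustrative tracer 1: counts context switches. */
static void count_switches(const struct task *prev, const struct task *next,
			   void *priv)
{
	(void)prev;
	(void)next;
	++*(unsigned long *)priv;
}

/* Illustrative tracer 2: remembers the pid that was switched in. */
static void record_next_pid(const struct task *prev, const struct task *next,
			    void *priv)
{
	(void)prev;
	*(int *)priv = next->pid;
}
```

With both probes registered, each call to trace_sched_switch() runs both
of them, and neither tracer needs its own call at the tracing site.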

> > dropping the silly fmt string but using the multiplex of trace_mark, and
> > then doing the stringify bit:
> >
> > "prev_pid %d next_pid %d prev_state %ld\n"
> >
> > in the actual tracer?
> >
>
> It would make much more sense to put this formatting information along
> with the trace point (e.g. in a kernel/sched-trace.h header) rather
> than to hide it in a tracer (loadable module) because this information
> is an interface to the trace point.

I'm not sure - it seems to me it should be part of the tracer because
it's a detail/subset of the actual data - rendering it useless for others
who'd like a different set.
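
A rough user-space illustration of that point, under stated assumptions:
struct task is again a cut-down stand-in for struct task_struct (the
prio field is included only for the sake of the example), and
fmt_pids_state() and fmt_prio() are hypothetical tracer-side formatters.
Both get the same raw pointers; each renders the subset it cares about.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for struct task_struct; only the fields used here. */
struct task {
	int pid;
	long state;
	int prio;
};

/* Tracer A: renders the subset named in the proposed marker format. */
static int fmt_pids_state(char *buf, size_t len,
			  const struct task *prev, const struct task *next)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "prev_pid %d next_pid %d prev_state %ld\n",
			prev->pid, next->pid, prev->state);
}

/* Tracer B: wants priorities, which that fixed format does not carry. */
static int fmt_prio(char *buf, size_t len,
		    const struct task *prev, const struct task *next)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "prev_prio %d next_prio %d\n",
			prev->prio, next->prio);
}
```

If the format were fixed at the trace point, tracer B could not get at
prio without a second marker or a cast of some pointer it was handed.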

> > IMHO the 'type safety' of the fmt string is over-rated, since it cannot
> > distinguish between a task_struct * or a bio *, both are pointers -
> > and half-arsed type safety is worse than no type safety.
> >
>
> I totally agree with you that not having the capacity to inspect pointer
> types is a problem for tracers which want to receive the "raw" pointer
> and deal with the data they need like big boys. On the other hand, it
> requires them to be closely tied to the kernel internals and therefore
> it makes sense to call them directly from the tracing site, thus
> bypassing the marker format string.
>
> However, letting the marker specify the data format so a tracer could
> format it into a memory buffer (in a binary or text format, depending on
> the implementation) or so that a tool like systemtap can use this
> identified information without having to be closely tied to the kernel
> makes sense to me.

So s-tap is meant to parse this string and interpret the varargs without
being closely tied to the kernel? - Somehow that doesn't make me feel
warm and fuzzy. That not only ties userspace to the information present
in the marker, but to the actual string as well.
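
A small sketch of what that coupling looks like from the consumer side.
This is not how SystemTap actually does it; marker_field_index() is a
hypothetical helper that assumes the format string is a plain sequence
of "name %conv" pairs, as in the example quoted earlier. It looks up the
vararg position of a field by its name.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the zero-based argument position of field `name` in a marker
 * format string of the form "name %conv name %conv ...", or -1 if the
 * name does not appear.
 */
static int marker_field_index(const char *fmt, const char *name)
{
	size_t name_len = strlen(name);
	const char *p = fmt;
	int index = 0;

	assert(name_len > 0);
	while (*p) {
		/* Skip separators between "name %conv" pairs. */
		while (*p == ' ' || *p == '\n')
			p++;
		if (!*p)
			break;

		/* The field name runs up to the next space. */
		const char *end = strchr(p, ' ');
		if (!end)
			break;	/* name without a conversion: give up */

		size_t len = (size_t)(end - p);
		if (len == name_len && strncmp(p, name, len) == 0)
			return index;

		/* Skip the conversion specifier that follows the name. */
		p = end + 1;
		while (*p && *p != ' ' && *p != '\n')
			p++;
		index++;
	}
	return -1;
}
```

A script that asks for "prev_pid" works against the string above, but
gets -1 as soon as someone renames the field in the kernel, even though
the data passed through the varargs is unchanged.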

The stronger you make this bind the less I like it.


