Re: [PATCH 4/7] tracing: new format for specialized trace points

From: Steven Rostedt
Date: Tue Mar 17 2009 - 11:41:25 EST



On Tue, 17 Mar 2009, Christoph Hellwig wrote:

> On Tue, Mar 10, 2009 at 12:57:14AM -0400, Steven Rostedt wrote:
> > Here's the example. The only updated macro in this patch is the
> > sched_switch trace point.
>
> Note that we shouldn't keep two variants around long-term, that's
> just going to cause confusion.
>
> > The old method looked like this:
> >
> > TRACE_EVENT_FORMAT(sched_switch,
> > 	TP_PROTO(struct rq *rq, struct task_struct *prev,
> > 		struct task_struct *next),
> > 	TP_ARGS(rq, prev, next),
> > 	TP_FMT("task %s:%d ==> %s:%d",
> > 		prev->comm, prev->pid, next->comm, next->pid),
> > 	TRACE_STRUCT(
> > 		TRACE_FIELD(pid_t, prev_pid, prev->pid)
> > 		TRACE_FIELD(int, prev_prio, prev->prio)
> > 		TRACE_FIELD_SPECIAL(char next_comm[TASK_COMM_LEN],
> > 				    next_comm,
> > 				    TP_CMD(memcpy(TRACE_ENTRY->next_comm,
> > 						  next->comm,
> > 						  TASK_COMM_LEN)))
> > 		TRACE_FIELD(pid_t, next_pid, next->pid)
> > 		TRACE_FIELD(int, next_prio, next->prio)
> > 	),
> > 	TP_RAW_FMT("prev %d:%d ==> next %s:%d:%d")
> > 	);
> >
> > The above method is hard to read and requires two format fields.
> >
> > The new method:
> >
> > /*
> >  * Tracepoint for task switches, performed by the scheduler:
> >  *
> >  * (NOTE: the 'rq' argument is not used by generic trace events,
> >  *        but used by the latency tracer plugin. )
> >  */
> > TRACE_EVENT(sched_switch,
> >
> > 	TP_PROTO(struct rq *rq, struct task_struct *prev,
> > 		 struct task_struct *next),
> >
> > 	TP_ARGS(rq, prev, next),
> >
> > 	TP_STRUCT__entry(
> > 		__array( char, prev_comm, TASK_COMM_LEN )
> > 		__field( pid_t, prev_pid )
> > 		__field( int, prev_prio )
> > 		__array( char, next_comm, TASK_COMM_LEN )
> > 		__field( pid_t, next_pid )
> > 		__field( int, next_prio )
> > 	),
> >
> > 	TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
> > 		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
> > 		__entry->next_comm, __entry->next_pid, __entry->next_prio),
> >
> > 	TP_fast_assign(
> > 		memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
> > 		__entry->prev_pid = prev->pid;
> > 		__entry->prev_prio = prev->prio;
> > 		memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
> > 		__entry->next_pid = next->pid;
> > 		__entry->next_prio = next->prio;
> > 	)
> > );
>
> While the idea behind it seems like an improvement to me, the
> implementation actually feels worse than the old one to me. I would
> expect this to look more like:
>
> struct trace_sched_switch {
> 	char prev_comm[TASK_COMM_LEN];
> 	pid_t prev_pid;
> 	int prev_prio;
> 	char next_comm[TASK_COMM_LEN];
> 	pid_t next_pid;
> 	int next_prio;
> };

We would love to do the above. The problem is that we also need a way
to automatically export each field's offset and size to userspace. That
is why we use the "__field()" and "__array()" macros: they generate that
information for us. Otherwise, we would have to maintain it by hand.
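
As a rough userspace sketch of why the macros help (this is not the
actual ftrace implementation; SCHED_SWITCH_FIELDS, struct field_desc,
find_field() and print_format() are names made up for illustration, and
the TASK_COMM_LEN value is assumed), a single field list can be expanded
once into the struct and once into an offset/size table:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>	/* pid_t */

#define TASK_COMM_LEN 16	/* assumed value for this sketch */

/* Hypothetical field list: written once, expanded twice below. */
#define SCHED_SWITCH_FIELDS(FIELD, ARRAY)		\
	ARRAY(char,  prev_comm, TASK_COMM_LEN)		\
	FIELD(pid_t, prev_pid)				\
	FIELD(int,   prev_prio)				\
	ARRAY(char,  next_comm, TASK_COMM_LEN)		\
	FIELD(pid_t, next_pid)				\
	FIELD(int,   next_prio)

/* Expansion 1: the record structure itself. */
#define DECL_FIELD(type, name)		type name;
#define DECL_ARRAY(type, name, len)	type name[len];

struct sched_switch_entry {
	SCHED_SWITCH_FIELDS(DECL_FIELD, DECL_ARRAY)
};

/* Expansion 2: a table describing every field of that structure. */
struct field_desc {
	const char *type;
	const char *name;
	size_t offset;
	size_t size;
};

#define ENTRY_SIZEOF(name) sizeof(((struct sched_switch_entry *)0)->name)

#define DESC_FIELD(type, name)					\
	{ #type, #name,						\
	  offsetof(struct sched_switch_entry, name),		\
	  ENTRY_SIZEOF(name) },
#define DESC_ARRAY(type, name, len)				\
	{ #type "[]", #name,					\
	  offsetof(struct sched_switch_entry, name),		\
	  ENTRY_SIZEOF(name) },

static const struct field_desc sched_switch_fields[] = {
	SCHED_SWITCH_FIELDS(DESC_FIELD, DESC_ARRAY)
};

#define NR_FIELDS \
	(sizeof(sched_switch_fields) / sizeof(sched_switch_fields[0]))

/* Look up a field description by name; returns NULL if not found. */
static const struct field_desc *find_field(const char *name)
{
	size_t i;

	for (i = 0; i < NR_FIELDS; i++)
		if (strcmp(sched_switch_fields[i].name, name) == 0)
			return &sched_switch_fields[i];
	return NULL;
}

/* Print one line per field, the kind of listing a reader tool could parse. */
static void print_format(FILE *out)
{
	size_t i;

	assert(out != NULL);
	for (i = 0; i < NR_FIELDS; i++)
		fprintf(out, "field:%s %s;\toffset:%zu;\tsize:%zu;\n",
			sched_switch_fields[i].type,
			sched_switch_fields[i].name,
			sched_switch_fields[i].offset,
			sched_switch_fields[i].size);
}
```

With a hand-written struct, the second table would have to be typed in
separately and kept in sync with every change to the struct.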


>
> static void trace_sched_assign(struct trace_sched_switch *dst, struct rq *rq,
> 		struct task_struct *prev, struct task_struct *next)
> {
> 	memcpy(dst->next_comm, next->comm, TASK_COMM_LEN);
> 	dst->prev_pid = prev->pid;
> 	dst->prev_prio = prev->prio;
> 	memcpy(dst->prev_comm, prev->comm, TASK_COMM_LEN);
> 	dst->next_pid = next->pid;
> 	dst->next_prio = next->prio;
> }

This we could take out of the macro and make a function.
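
A minimal userspace sketch of that split, assuming a made-up
DEFINE_PROBE() macro and stand-in types (struct task_info replaces
struct task_struct, the 'rq' argument is dropped, and none of these
names exist in the kernel): the assignment is a plain, type-checked
function and the macro only takes its name.

```c
#include <string.h>
#include <sys/types.h>	/* pid_t */

#define TASK_COMM_LEN 16	/* assumed value for this sketch */

/* Stand-in for struct task_struct: only the members used here. */
struct task_info {
	char comm[TASK_COMM_LEN];
	pid_t pid;
	int prio;
};

/* The record written for each event. */
struct sched_switch_rec {
	char prev_comm[TASK_COMM_LEN];
	pid_t prev_pid;
	int prev_prio;
	char next_comm[TASK_COMM_LEN];
	pid_t next_pid;
	int next_prio;
};

/* The assignment lives outside any macro, as an ordinary function. */
static void sched_switch_assign(struct sched_switch_rec *dst,
				const struct task_info *prev,
				const struct task_info *next)
{
	memcpy(dst->prev_comm, prev->comm, TASK_COMM_LEN);
	dst->prev_pid = prev->pid;
	dst->prev_prio = prev->prio;
	memcpy(dst->next_comm, next->comm, TASK_COMM_LEN);
	dst->next_pid = next->pid;
	dst->next_prio = next->prio;
}

/* Hypothetical macro: generates a probe that calls the named function. */
#define DEFINE_PROBE(name, rec_type, assign_fn)				\
static void probe_##name(rec_type *rec,					\
			 const struct task_info *prev,			\
			 const struct task_info *next)			\
{									\
	assign_fn(rec, prev, next);					\
}

DEFINE_PROBE(sched_switch, struct sched_switch_rec, sched_switch_assign)
```

A mistake inside sched_switch_assign() is then reported against the
function's own source line instead of the line of the macro invocation.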

-- Steve

>
>
> TRACE_EVENT(sched_switch,
> 	trace_proto(struct rq *rq, struct task_struct *prev,
> 		struct task_struct *next),
> 	trace_args(rq, prev, next),
> 	trace_struct(struct trace_sched_switch),
> 	trace_assign(trace_sched_assign),
>
> 	trace_pretty_print("task %s:%d [%d] ==> %s:%d [%d]",
> 		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
> 		__entry->next_comm, __entry->next_pid, __entry->next_prio)
> );
>
>