Re: [PATCH] tracing/function-branch-tracer: enhancements for the trace output

From: Ingo Molnar
Date: Fri Nov 28 2008 - 08:06:06 EST



* Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

>
> On Thu, 27 Nov 2008, Frédéric Weisbecker wrote:
> > >
> > >> > ---------------------------------------------------------
> > >> > CPU)   cost      |  function
> > >> > ---------------------------------------------------------
> > >> >
> > >> >  0)              |  sys_read() {
> > >> >  0)    0.331 us  |    fget_light();
> > >> >  0)              |    vfs_read() {
> > >> >  0)              |      rw_verify_area() {
> > >> >  0)              |        security_file_permission() {
> > >> >  0)    0.306 us  |          cap_file_permission();
> > >> >  0)    0.300 us  |          cap_file_permission();
> > >> >  0)    8.909 us  |        }
> > >> >  0)    0.993 us  |      }
> > >> >  0)   11.649 us  |+   }
> > >> >  0)              |    do_sync_read() {
> > >> >  0)              |      sock_aio_read() {
> > >> >  0)              |        __sock_recvmsg() {
> > >> >  0)              |          security_socket_recvmsg() {
> > >> >  0)  100.319 us  |!           cap_socket_recvmsg();
> > >> > ---------------------------------------------------------
> > >
> > > Hm?
> >
> > I like it before the CPU number. The main purpose would be to scroll
> > quickly the file and find the overheads. That would be easy if set as
> > a first character.
>
> No, please keep the CPU # first. If anything, you will want to separate
> out the CPUs first. Otherwise you will see things all mixed up.
>
> Hmm, I could also add per-CPU files.
>
> debugfs/tracing/buffers/cpu0
> debugfs/tracing/buffers/cpu1
> debugfs/tracing/buffers/cpu2
> debugfs/tracing/buffers/cpu3
>
> That would print out the trace for a single CPU.

yes, doing that makes sense anyway: if someone wants to make the mistake
of capturing a _LOT_ of tracing events without pre-filtering them in the
kernel intelligently (just to be able to waste days sorting them apart
and analyzing them, and then billing the client for the cost), then it
would make sense to start 4 threads on all 4 CPUs, switch the ftrace
output mode into raw binary format and read the per-CPU buffers into a
userspace buffer.
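
A minimal userspace sketch of that reader, assuming the per-CPU files
proposed above existed and could be read like ordinary files. The paths
are passed in by the caller because debugfs/tracing/buffers/cpuN is only
a proposal in this thread; read_cpu_buffer() and read_all_cpus() are
hypothetical names, and the raw binary records are collected as opaque
bytes, not parsed. Pinning each thread to its CPU is left out.

```python
import threading

# The ring buffer is described above as based on 4K pages, so read in
# page-sized chunks.
PAGE_SIZE = 4096


def read_cpu_buffer(path, chunks):
    """Read one per-CPU file in page-sized chunks until EOF."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(PAGE_SIZE)
            if not chunk:
                break
            chunks.append(chunk)


def read_all_cpus(paths):
    """Run one reader thread per path and return {path: bytes}."""
    # Each thread appends only to its own list, so no locking is needed.
    collected = {path: [] for path in paths}
    threads = [
        threading.Thread(target=read_cpu_buffer, args=(path, collected[path]))
        for path in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {path: b"".join(chunks) for path, chunks in collected.items()}
```

Calling read_all_cpus() with four paths, one per CPU, would give the
"4 threads on all 4 CPUs" arrangement, with each CPU's data kept apart
from the start instead of being sorted out afterwards.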

We could perhaps even zero-copy it all straight to the pagecache: the
ring-buffer is 4K pages based already and the data is position
independent.

->splice_read() / ->splice_write() support for the ring-buffer would
nicely enable all of that, to all splice-able IO transports. (i.e. the
majority of IO transports that matter)

> BTW, I'm really not here. I'm on holiday eating turkeys.

okay.

Ingo