Re: [PATCH 4/4] trace: profile all if conditionals

From: Andi Kleen
Date: Sun Nov 23 2008 - 15:14:37 EST


On Sun, Nov 23, 2008 at 02:56:42PM -0500, Steven Rostedt wrote:

You snipped my earlier suggestion? Can't you just use kernel gcov
for this? Frankly, its output is infinitely more useful than
the output of your patch. It also addresses Andrew's suggestion
of profiling other control flow constructs.

I know it's not ftrace, but not everything is bad just because it's
not seen through the ftrace spectacles @)

> On Sun, 23 Nov 2008, Andi Kleen wrote:
> > Steven Rostedt <rostedt@xxxxxxxxxxx> writes:
>
> > > This adds a significant amount of overhead and should only be used
> > > by those analyzing their system.
> >
> > Often this can also be done using CPU performance counters. That
> > might be a cheaper option.
>
> I'd love to add an option that could hook into any arch with HW support
> for this. We could dump out the same information, just gathered in a
> different way. But I'm still ignorant of how to use CPU performance
> counters and how to find which branch matches which if.

The theory is quite simple. Typically there are events for
"taken branches" and others for "not taken" branches. So you set up
two counters using the existing oprofile support and collect
the samples. Then you combine these two sample streams.

[Sometimes you also have to synthesize these events, because
CPUs like to count predicted and mispredicted (in the CPU sense)
branches separately, but that's also quite simple
(on x86/Core these can all be specified in the unit mask for
the same event).]
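
A rough sketch of the combining step, assuming the two sample streams
have already been extracted from the oprofile data as plain lists of
branch addresses (one entry per sample). The function names and the
input format are made up for illustration; they are not part of oprofile.

```python
from collections import Counter


def synthesize_event(*streams):
    """Concatenate several sample streams into one logical event.

    Hypothetical helper: e.g. merge "taken, predicted" and
    "taken, mispredicted" samples into a single "taken" stream.
    """
    merged = []
    for stream in streams:
        merged.extend(stream)
    return merged


def combine_branch_samples(taken, not_taken):
    """Merge two per-address sample streams.

    Returns {address: (taken_samples, not_taken_samples, taken_ratio)}.
    """
    taken_counts = Counter(taken)
    not_taken_counts = Counter(not_taken)
    combined = {}
    # Every address in the union has at least one sample, so the
    # division below cannot hit zero.
    for addr in taken_counts.keys() | not_taken_counts.keys():
        t = taken_counts[addr]
        n = not_taken_counts[addr]
        combined[addr] = (t, n, t / (t + n))
    return combined
```
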

The sampling will be statistical (not every branch is counted),
but that's OK because only branches that are executed a significant
number of times are interesting anyway.
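
To illustrate the statistical point: with one sample taken every N
events, the sample count times N estimates the real event count, and
branches with only a handful of samples can simply be dropped. A minimal
sketch; the function name, the threshold and the input format are
arbitrary assumptions.

```python
def hot_branches(samples_per_addr, min_samples=100, events_per_sample=1):
    """Keep only branches with enough samples to be meaningful.

    samples_per_addr:  {address: number of samples seen at that address}
    min_samples:       arbitrary cut-off below which a branch is ignored
    events_per_sample: sampling period (events between two samples)

    Returns {address: estimated number of events}.
    """
    return {
        addr: count * events_per_sample
        for addr, count in samples_per_addr.items()
        if count >= min_samples
    }
```
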

The only problem is that you have to map back to source code lines, which
can be done in user space based on the oprofile output and some
addr2line or similar hacks. oprofile can also do this,
although it gives this information only indirectly, so a custom
tool might be easier.
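
A sketch of what such a custom tool could do for the mapping step,
assuming a vmlinux built with debug info and binutils' addr2line in
PATH. The helper names are made up; only the "addr2line -e <file>
<addr>..." invocation and its "file:line" output format are relied on.

```python
import subprocess


def parse_addr2line_output(text):
    """Turn addr2line's "file:line" lines into (file, line) tuples.

    Unknown locations are printed as "??:0" or "??:?"; a line number
    that is not numeric is returned as None.
    """
    locations = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        path, _, rest = line.rpartition(":")
        # Newer binutils may append " (discriminator N)" after the number.
        fields = rest.split()
        lineno = fields[0] if fields else ""
        locations.append((path, int(lineno) if lineno.isdigit() else None))
    return locations


def resolve_addresses(vmlinux, addresses):
    """Map branch addresses to source locations by running addr2line."""
    cmd = ["addr2line", "-e", vmlinux] + [hex(a) for a in addresses]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    # Without -f or -i, addr2line prints exactly one line per address.
    return dict(zip(addresses, parse_addr2line_output(result.stdout)))
```
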

Note it doesn't even need new kernel code, assuming
the architecture already has a working full oprofile implementation.

The main advantage over gcov would be lower runtime overhead,
although gcov gives the better output (and is already working,
too).

-Andi




--
ak@xxxxxxxxxxxxxxx