Re: [PATCH 0/9] perf: Adding better precise_ip field handling

From: Ingo Molnar
Date: Sat May 11 2013 - 03:50:24 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Fri, May 10, 2013 at 12:55:36PM +0200, Ingo Molnar wrote:
> > Look at the tools/perf/ patches, they don't actually need or use that
> > information to adjust for skid!
> >
> > If user-space wants _that_ level of control because it wants to correct
> > for skid (if there's skid), or if it wants to display to the user how
> > precise the profiling is, then they can do the (much) more complex probing
> > dance.
> >
> > What is absolutely indefensible is to not give a good shortcut for the
> > most common case of 'give me the most precise cycles event you got'...
>
> That's not what I'm saying... the user (not userspace, but you and me)
> when staring at perf output needs to interpret the result.
>
> If you don't know WTF the thing actually measured, how are you going to
> do that?

That's really a red herring: there's absolutely no reason why the
kernel could not pass back the level of precision it provided.

You are also over-rating the importance of such details - most developers
will assume when looking at profiler output that it's a statistical result
- and will be happy when it happens to be "absolutely accurate" instead of
just "very accurate"...

> > > I see such a feature only causing confusion; I told it to be
> > > precise, therefore this register op after the memory load really is
> > > the more expensive thing.
> >
> > You are creating confusion where there's none: "give me the best
> > profiling you've got" is a pretty reasonable thing to ask.
>
> Only if it then tells you what you got. It doesn't do that.

I'm not at all against the kernel telling us what precision it gave us.
That could be solved by the kernel setting the precision field in the
PERF_COUNT_CYCLES_PRECISE case or so.

I'm against you apparently recommending a complex probing method to get
something the kernel ought to give us straight away in a much simpler
way...
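
For illustration, the user-space "probing dance" could look roughly like
the sketch below: request precise_ip = 3 and step down until
perf_event_open() accepts the event. Only struct perf_event_attr, its
precise_ip field and the perf_event_open syscall are real; open_fn,
open_cycles(), open_fake() and probe_max_precise() are hypothetical names
made up for this sketch, and the sample period is an arbitrary choice.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical callback type: try to open a cycles event at the given
 * precise_ip level; return an fd >= 0 on success, -1 on failure. */
typedef int (*open_fn)(int precise_ip, void *ctx);

/* Real opener: sampling cycles event on the calling thread, any CPU. */
int open_cycles(int precise_ip, void *ctx)
{
	struct perf_event_attr attr;

	(void)ctx;
	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 100000;	/* arbitrary, for the sketch only */
	attr.disabled = 1;
	attr.precise_ip = precise_ip;	/* 0..3, higher requests less skid */

	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Stand-in opener for dry runs: pretends the PMU supports levels up to
 * *(int *)ctx and hands back a fake descriptor. */
int open_fake(int precise_ip, void *ctx)
{
	int max_supported = *(int *)ctx;

	return precise_ip <= max_supported ? 1000 + precise_ip : -1;
}

/* Walk precise_ip from 3 down to 0 and keep the first level that opens.
 * Returns the fd (or -1) and stores the granted level in *level. */
int probe_max_precise(open_fn try_open, void *ctx, int *level)
{
	for (int p = 3; p >= 0; p--) {
		int fd = try_open(p, ctx);

		if (fd >= 0) {
			*level = p;
			return fd;
		}
	}
	return -1;
}
```

A caller would pass open_cycles and a NULL context, then close() the
returned fd when done; open_fake only exists so the loop can be exercised
on a machine without perf access. Up to four syscalls to learn one small
number is the kind of overhead a kernel-side shortcut would avoid.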

Complexity does not primarily result in people doing things 'smarter'. It
primarily results in people _not using the feature at all_.

> > The thing is, there are variations in the quality of profiling between
> > CPUs, sometimes even between CPU models. 99.999% of the people don't
> > care about that, because 99.9% of the time the profile is unambiguous:
> > functions are typically big enough, with the overhead somewhere in the
> > middle, so skid just doesn't matter.
>
> Sure at function level it doesn't matter, but once you found your most
> expensive function very often the next question is _why_ is it
> expensive.
>
> At that point you're going to stare at asm output. The moment you do
> that you need to know the type of output you're staring at.

FYI, very few developers are actually looking at the assembly output
because very few developers _know_ assembly to begin with.

They are looking at things like sysprof or perf report output, maybe at
the annotated _source_ code and that's it.

The mapping to source code is fuzzy to begin with, with inlining, loop
unrolling and other compiler optimizations being a far bigger effect than
skid.

So the fuzz created by skid is relatively small - but it's nice when it's
gone and obviously it's helpful when you are looking at assembly output.

> Also, if you think function level output is the most relevant one, you
> shouldn't use PEBS at all. PEBS has an issue with REP prefixes: it
> severely under-accounts the cycles spent on them. And since exact
> placement doesn't matter (as you just argued) the little skid you have
> is irrelevant.
>
> So either skid matters and you need to know what type of output you've
> got, or it doesn't and the whole precise thing is irrelevant at best.

That's just another plain silly argument: having more precise results is
obviously useful even if you don't use a magnifying lens. Sometimes
functions are small and skid results in the wrong function being credited
with overhead.
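
A toy model of that misattribution, as a sketch only: it assumes skid can
be approximated as a fixed byte offset added to the true instruction
pointer, which real hardware does not promise. The symbol table, the
addresses and the helpers resolve() and credited() are all made up for
the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol table entry covering the range [start, end). */
struct sym {
	const char *name;
	uint64_t start;
	uint64_t end;
};

/* Made-up layout: a 16-byte function followed directly by a larger one. */
static const struct sym symtab[] = {
	{ "small_fn", 0x1000, 0x1010 },
	{ "next_fn",  0x1010, 0x1100 },
};

/* Resolve a sampled IP to the symbol containing it, or "?" if none does. */
const char *resolve(uint64_t ip)
{
	for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (ip >= symtab[i].start && ip < symtab[i].end)
			return symtab[i].name;
	return "?";
}

/* Toy skid model: the reported IP lands skid_bytes past the true one. */
const char *credited(uint64_t true_ip, uint64_t skid_bytes)
{
	return resolve(true_ip + skid_bytes);
}
```

With a true IP of 0x100c inside small_fn, zero skid credits small_fn,
while 8 bytes of skid push the sample into next_fn. The same 8 bytes
applied at 0x1080, in the middle of the larger function, change nothing.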

It's also immaterial: there's no reason why the kernel couldn't feed back
the level of precision it offers, to user-space, via a small, simple
variation to the existing syscall interface.

Thanks,

Ingo