Re: [PATCH 5/6] perf_counter: add more context information

From: Peter Zijlstra
Date: Thu Apr 02 2009 - 14:42:50 EST


On Thu, 2009-04-02 at 20:34 +0200, Ingo Molnar wrote:
> * Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
>
> > On Thu, 2009-04-02 at 20:18 +0200, Ingo Molnar wrote:
> > > * Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> > >
> > > > On Thu, 2009-04-02 at 13:36 +0200, Ingo Molnar wrote:
> > > >
> > > > > > -#define MAX_STACK_DEPTH 255
> > > > > > +#define MAX_STACK_DEPTH 254
> > > > > >
> > > > > > struct perf_callchain_entry {
> > > > > > - u64 nr;
> > > > > > + u32 nr, hv, kernel, user;
> > > > > > u64 ip[MAX_STACK_DEPTH];
> > > > > > };
> > > >
> > > > Oh, and Paul suggested using u16s right after I sent it out. So
> > > > I'll either send an update or send an incremental in case you
> > > > already applied it.
> > >
> > > yes, that's probably a good idea. Although u8 might be even better -
> > > do we ever want to do more than 256 deep stack vectors? Even those
> > > would take quite some time to construct and pass down.
> >
> > We'd have to pad it with 4 more bytes to remain u64 aligned,
>
> ok, indeed.
>
> > [...] also, why restrict ourselves. That MAX_STACK_DEPTH limit is
> > trivially fixable if indeed someone finds it insufficient.
>
> well .. think about it: walking more than 256 stack frames for every
> IRQ event? Getting backtraces like:
>
> <func_0+0x123>
> ...
> <func_269+0x123>
>
> does that make much sense _per event_? How do you visualize it?

You can use it to calculate aggregate times, e.g. attribute the time
spent in func_0 to func_1, to func_2, and so on. You can then build a
tree view from these call-chains and drill down into it, which is
basically what the sysprof GUI does.
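
A rough userspace sketch of that kind of aggregation, in case it helps
to make the idea concrete. Everything here is hypothetical and not part
of the patch: struct call_node, node_new(), node_get_child() and
tree_add_chain() are made-up names, and I'm assuming ip[0] is the
innermost frame and ip[nr - 1] the outermost caller. I have no idea
whether sysprof is structured like this internally.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace node type; not part of the patch. */
struct call_node {
	uint64_t ip;              /* address identifying the function */
	uint64_t total;           /* samples in this function or its callees */
	uint64_t self;            /* samples taken in this function itself */
	struct call_node *child;  /* first callee */
	struct call_node *next;   /* next sibling under the same caller */
};

static struct call_node *node_new(uint64_t ip)
{
	struct call_node *n = calloc(1, sizeof(*n));

	assert(n != NULL);
	n->ip = ip;
	return n;
}

/* Find the callee of parent with address ip, creating it if needed. */
static struct call_node *node_get_child(struct call_node *parent, uint64_t ip)
{
	struct call_node *n;

	for (n = parent->child; n; n = n->next)
		if (n->ip == ip)
			return n;

	n = node_new(ip);
	n->next = parent->child;
	parent->child = n;
	return n;
}

/*
 * Fold one sampled chain into the tree. Assumes ip[0] is the innermost
 * frame (where the sample hit) and ip[nr - 1] is the outermost caller,
 * so the chain is walked backwards to build a top-down tree.
 */
static void tree_add_chain(struct call_node *root, const uint64_t *ip, size_t nr)
{
	struct call_node *n = root;
	size_t i;

	root->total++;
	for (i = nr; i-- > 0; ) {
		n = node_get_child(n, ip[i]);
		n->total++;
	}
	if (nr)
		n->self++;
}
```

With one such tree per profile, total on a node is the number of
samples attributed to that function including everything it called,
and self is what was spent in the function itself. A GUI can then
expand the children of a node on demand, which is the drill-down part.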
