Re: [PATCH 1/3] perf callchain: Convert children list to rbtree

From: Frederic Weisbecker
Date: Tue Sep 10 2013 - 07:34:49 EST


On Tue, Sep 10, 2013 at 12:25:54PM +0200, Ingo Molnar wrote:
>
> * Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
>
> > On Tue, Sep 10, 2013 at 05:24:16PM +0900, Namhyung Kim wrote:
> > > From: Namhyung Kim <namhyung.kim@xxxxxxx>
> > >
> > > The current collapse stage has a scalability problem which can be
> > > reproduced easily with a parallel kernel build. This is because it
> > > needs to traverse every child of a callchain linearly during the
> > > collapse/merge stage. Converting the children list to an rbtree
> > > reduces the overhead significantly.
> > >
> > > On my 400MB perf.data file, which was recorded with a make -j32 kernel build:
> >
> >
> > nice!!!
>
> Nice indeed!
>
> > tried on 2.6 GB data file from kernel make -j64 and got report speed up
> > from 'never' to 2m52.756s ;-)
>
> It's still rather long though, unacceptable for everyday usage :-/
>
> Frederic thought that we could minimize collapsing to begin with.
>
> Frederic, could you outline that in more detail please?

Yeah. Currently, when we sort by comm, hists are first sorted by tid. Then,
at the end of the record, the hists are compared and those that have the
same comm are collapsed into one.

So what I'm trying to do now is to gather those hists from the very beginning,
which should remove the need for collapsing.
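
To illustrate the idea, here is a minimal sketch, not the actual perf code.
It assumes a hypothetical, simplified entry type (struct comm_entry) and
made-up helper names (comm_entry_get, add_sample, count_entries), and it uses
a plain unbalanced binary tree where real code would more likely use an
rbtree. Each sample is accounted to the entry for its comm at insertion
time, so entries with the same comm never exist separately and no later
collapse pass is needed:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a hist entry keyed by comm. */
struct comm_entry {
	char comm[16];
	uint64_t period;
	struct comm_entry *left, *right;
};

/*
 * Find the entry for @comm, creating it if it does not exist yet.
 * Plain unbalanced binary search tree, for brevity only.
 */
static struct comm_entry *comm_entry_get(struct comm_entry **root,
					 const char *comm)
{
	struct comm_entry **p = root;
	struct comm_entry *e;

	while (*p) {
		int cmp = strncmp(comm, (*p)->comm, sizeof((*p)->comm) - 1);

		if (cmp == 0)
			return *p;
		p = cmp < 0 ? &(*p)->left : &(*p)->right;
	}

	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	/* calloc() zeroed the buffer, so the copy stays NUL-terminated. */
	strncpy(e->comm, comm, sizeof(e->comm) - 1);
	*p = e;
	return e;
}

/* Account one sample directly to its comm; tid is never used as a key. */
static int add_sample(struct comm_entry **root, const char *comm,
		      uint64_t period)
{
	struct comm_entry *e = comm_entry_get(root, comm);

	if (!e)
		return -1;
	e->period += period;
	return 0;
}

/* Count the entries in the tree. */
static size_t count_entries(const struct comm_entry *e)
{
	return e ? 1 + count_entries(e->left) + count_entries(e->right) : 0;
}
```

With this layout, feeding in samples from several threads that share a comm
leaves a single entry for that comm, with the periods already summed.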

>
> Thanks,
>
> Ingo