Re: [PATCH 0/3] powerpc/perf: Enable linking with libunwind

From: Jiri Olsa
Date: Thu Mar 06 2014 - 12:49:56 EST


On Wed, Mar 05, 2014 at 08:41:56PM -0800, Sukadev Bhattiprolu wrote:
> When we try to create backtraces (call-graphs) with the perf tool
>
> perf record -g /tmp/sprintft
>
> we get backtraces with duplicate arcs for sprintft[1]:
>
> 14.61%  sprintft  libc-2.18.so  [.] __random
>         |
>         --- __random
>            |
>            |--61.09%-- __random
>            |          |
>            |          |--97.18%-- rand
>            |          |          do_my_sprintf
>            |          |          main
>            |          |          generic_start_main.isra.0
>            |          |          __libc_start_main
>            |          |          0x0
>            |          |
>            |           --2.82%-- do_my_sprintf
>            |                     main
>            |                     generic_start_main.isra.0
>            |                     __libc_start_main
>            |                     0x0
>            |
>             --38.91%-- rand
>                       |
>                       |--92.90%-- rand
>                       |          |
>                       |          |--99.87%-- do_my_sprintf
>                       |          |          main
>                       |          |          generic_start_main.isra.0
>                       |          |          __libc_start_main
>                       |          |          0x0
>                       |           --0.13%-- [...]
>                       |
>                        --7.10%-- do_my_sprintf
>                                  main
>                                  generic_start_main.isra.0
>                                  __libc_start_main
>                                  0x0
>
> (where the two arcs both have the same backtrace but are not merged).
>
> Linking with libunwind seems to create better backtraces. x86 and
> ARM processors have support for linking with libunwind, but Power does not.
> This patchset is an RFC for linking with libunwind.
>
> With this patchset and running:
>
> /tmp/perf record --call-graph=dwarf,8192 /tmp/sprintft
>
> the backtrace is:
>
> 14.94%  sprintft  libc-2.18.so  [.] __random
>         |
>         --- __random
>             rand
>             do_my_sprintf
>             main
>             generic_start_main.isra.0
>             __libc_start_main
>             (nil)
>
> This appears better.
>
> One downside is that we now need the kernel to save the entire user stack
> (the 8192 in the command line is the default user stack size).
>
> A second issue is that this invocation of perf (with --call-graph=dwarf,8192)
> seems to fail for backtraces involving tail-calls[2]
>
> /tmp/perf record -g ./tailcall
> gives
>
> 20.00%  tailcall  tailcall  [.] work2
>         |
>         --- work2
>             work
>
> shows the tail function 'work2' as "called from" 'work()'
>
> But with libunwind:
>
> /tmp/perf record --call-graph=dwarf,8192 ./tailcall
> we get:
>
> 20.50%  tailcall  tailcall  [.] work2
>         |
>         --- work2
>
> the caller of 'work' is not shown.
>
> I am debugging this, but would appreciate any feedback/pointers on the
> patchset/direction:
>
> - Does libunwind need the entire user stack to work or are there
> optimizations we can do to save the minimal entries for it to
> perform the unwind.

AFAIK you don't need to provide the whole stack, but the more
you have, the bigger the chance you'll get a full(er) backtrace

>
> - Does libunwind work with tailcalls like the one above ?

not sure, but if you have an x86 alternative to your tailcall (I cannot
read ppc assembly) I could try it on x86 ;-)

CC-ing Jean, as he might have seen this issue..
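
For reference, an architecture-neutral C version of such a test could look
roughly like the sketch below. This is NOT the program from [2]; only the
names work and work2 come from the report, while driver, NOINLINE and the
loop body are made up for illustration. It assumes GCC or Clang at -O2, where
the call in work() is normally emitted as a jump (sibling call), so work()
has no frame left on the stack while work2() runs.

```c
/* Hypothetical stand-in for the tailcall test; not the original from [2]. */
#include <stdint.h>

/* Keep each function out of line so it stays visible as a separate symbol. */
#define NOINLINE __attribute__((noinline))

/* Leaf function: burns cycles so that most samples land here. */
NOINLINE uint64_t work2(uint64_t n)
{
    volatile uint64_t acc = 0;  /* volatile keeps the loop from being folded */

    for (uint64_t i = 0; i < n; i++)
        acc += i;
    return acc;
}

/*
 * The call to work2() is in tail position. With optimization enabled the
 * compiler usually replaces "call + return" with a plain jump, so an
 * unwinder looking at the stack from inside work2() may not see work().
 */
NOINLINE uint64_t work(uint64_t n)
{
    return work2(n);
}

/*
 * Ordinary (non-tail) caller: the addition after the call keeps this
 * frame alive, so it should still show up in a backtrace.
 */
NOINLINE uint64_t driver(uint64_t rounds, uint64_t n)
{
    uint64_t total = 0;

    for (uint64_t r = 0; r < rounds; r++)
        total += work(n);
    return total;
}
```

Calling driver() from main() with large arguments and recording with both
-g and --call-graph=dwarf should be enough to compare the two unwinders on
any arch; whether the tail call is really emitted is worth checking in the
disassembly first.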


>
> - Are there benefits to linking with libunwind (even if it does not
> yet solve the tailcall problem)

it provides backtraces for binaries/distros/archs compiled without frame pointers

>
> - Are there any examples of using libdwarf to solve the tailcall
> issue ?


btw there's now a remote unwinder in elfutils (version 0.158);
the perf support is in Arnaldo's perf/core tree

jirka