[GIT PULL 00/23] perf/core improvements and fixes

From: Arnaldo Carvalho de Melo
Date: Wed Jan 25 2017 - 08:55:02 EST


Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 9f6f941e25bad8fcffc24d10762962d62edba767:

Merge tag 'perf-core-for-mingo-4.11-20170117' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-01-18 10:06:20 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170125

for you to fetch changes up to bb6457b8af267b92ba3e752d9ccb3a4d4965a912:

perf ftrace: Make 'function_graph' be the default tracer (2017-01-25 10:37:27 -0300)
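
For anyone who wants to look over the series before merging it, the steps can
be sketched roughly as below. This is only an illustration: the run() wrapper,
the DRY_RUN variable and the two function names are made up for this example,
while the URL, tag and base commit are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 the command is printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Values taken from the pull request text above.
URL=git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
TAG=tags/perf-core-for-mingo-4.11-20170125
BASE=9f6f941e25bad8fcffc24d10762962d62edba767

# Fetch the signed tag without merging it; the result lands in FETCH_HEAD.
fetch_tag() {
    run git fetch "$URL" "$TAG"
}

# List the commits that the pull would bring in on top of the base commit.
list_new_commits() {
    run git log --oneline "$BASE..FETCH_HEAD"
}
```

Running fetch_tag followed by list_new_commits inside a kernel tree should show
the commits summarized in the shortlog further down; setting DRY_RUN=1 first
only prints the two git command lines.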

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Disassemble x86 branch stacks using "-F brstackasm" and Intel PT
traces with "-F asm" in 'perf script' using Intel's XED library.

Since this is not widely available in pre-packaged forms in distros,
it will not be automatically feature probed, needing to be explicitly
enabled at perf build time using the XED=1 and optionally the XED_DIR
make command line variables.

See the changeset log messages and committer notes to see it in action
(Andi Kleen)

- Introduce 'perf ftrace', a perf front end to the kernel's ftrace
function and function_graph tracers, defaulting to the "function_graph"
tracer. More work will be done in reviving this effort, forward porting
it from its initial patch submission (Namhyung Kim)

- Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)
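
The XED and 'perf ftrace' items above could be tried out along these lines.
This is a sketch under assumptions, not a copy of the changeset logs: the run()
wrapper, DRY_RUN and the function names are invented for illustration, the
'-t' tracer option of 'perf ftrace' is assumed from the perf-ftrace
documentation added by this series and should be checked there, and whether
'-F asm' needs other fields alongside it is not stated here.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 the command is printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# XED is not feature probed, so XED=1 has to be passed explicitly.
# $1 (optional): where the XED library is installed, passed as XED_DIR.
build_perf_with_xed() {
    if [ -n "${1:-}" ]; then
        run make -C tools/perf XED=1 XED_DIR="$1"
    else
        run make -C tools/perf XED=1
    fi
}

# $1: output field, "brstackasm" for branch stacks or "asm" for Intel PT.
script_disasm() {
    run perf script -F "$1"
}

# $1: tracer name, or "" to take the default (function_graph in this series).
# Remaining arguments: the workload to trace.
ftrace_workload() {
    tracer="$1"
    shift
    if [ -n "$tracer" ]; then
        # '-t' is an assumption, see the note above.
        run perf ftrace -t "$tracer" "$@"
    else
        run perf ftrace "$@"
    fi
}
```

For example, 'build_perf_with_xed /opt/xed' (the path is a placeholder) builds
with XED enabled, 'script_disasm brstackasm' prints disassembled branch stacks
from an existing perf.data, and 'ftrace_workload "" usleep 1' traces a short
workload with the default tracer. Using ftrace normally requires root.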

Fixes:

- Fix wrong register name for arm64, used in 'perf probe' (He Kuang)

- Fix map offsets in relocation in libbpf (Joe Stringer)

- Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)

Infrastructure:

- libbpf prog functions sync with what is exported via uapi (Joe Stringer)

Trivial:

- Remove unnecessary checks and assignments in 'perf probe's
try_to_find_absolute_address() (Markus Elfring)

Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>

----------------------------------------------------------------
Andi Kleen (5):
perf tools: Add probing for the XED disassembler library
perf tools: Add one liner warning for disabled features
perf tools: Add disassembler for x86 using the XED library
perf script: Add support for printing assembler
perf script: Add "brstackasm" output for branch stacks

Arnaldo Carvalho de Melo (3):
perf scripting perl: Do not die() when not founding event for a type
perf tools: Propagate perf_config() errors
perf ftrace: Make 'function_graph' be the default tracer

He Kuang (1):
perf probe: Fix wrong register name for arm64

Jiri Olsa (4):
perf hists browser: Put hist_entry folding logic into single function
perf hists browser: Add e/c hotkeys to expand/collapse callchain for current entry
perf c2c report: Display Total records column in offset view
perf c2c report: Coalesce by default only by pid,iaddr

Joe Stringer (4):
tools lib bpf: Fix map offsets in relocation
tools lib bpf: Define prog_type fns with macro
tools lib bpf: Add set/is helpers for all prog types
tools lib bpf: Add libbpf_get_error()

Markus Elfring (2):
perf probe: Delete an unnecessary check in try_to_find_absolute_address()
perf probe: Delete an unnecessary assignment in try_to_find_absolute_address()

Matija Glavinic Pecotic (1):
perf unwind: Fix looking up dwarf unwind stack info

Namhyung Kim (3):
perf util: Save pid-cmdline mapping into tracing header
perf util: Add more debug message on failure path
perf ftrace: Introduce new 'ftrace' tool

tools/build/Makefile.feature | 4 +-
tools/build/feature/Makefile | 6 +-
tools/build/feature/test-all.c | 14 +
tools/build/feature/test-xed.c | 9 +
tools/lib/bpf/libbpf.c | 69 +++--
tools/lib/bpf/libbpf.h | 14 +-
tools/perf/Build | 1 +
tools/perf/Documentation/perf-c2c.txt | 2 +-
tools/perf/Documentation/perf-ftrace.txt | 36 +++
tools/perf/Documentation/perf-script.txt | 15 +-
tools/perf/Makefile.config | 29 ++
tools/perf/Makefile.perf | 3 +
tools/perf/arch/arm64/include/dwarf-regs-table.h | 12 +-
tools/perf/arch/x86/util/Build | 3 +
tools/perf/arch/x86/util/dis.c | 86 ++++++
tools/perf/builtin-c2c.c | 3 +-
tools/perf/builtin-ftrace.c | 243 +++++++++++++++
tools/perf/builtin-help.c | 6 +-
tools/perf/builtin-kmem.c | 8 +-
tools/perf/builtin-record.c | 4 +-
tools/perf/builtin-report.c | 4 +-
tools/perf/builtin-script.c | 338 ++++++++++++++++++++-
tools/perf/builtin-top.c | 4 +-
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 1 +
tools/perf/perf.c | 16 +-
tools/perf/tests/llvm.c | 2 +-
tools/perf/ui/browsers/hists.c | 60 ++--
tools/perf/util/Build | 1 +
tools/perf/util/callchain.c | 14 +-
tools/perf/util/config.c | 9 +-
tools/perf/util/data-convert-bt.c | 7 +-
tools/perf/util/dis.c | 15 +
tools/perf/util/dis.h | 23 ++
tools/perf/util/dso.c | 48 ++-
tools/perf/util/header.c | 4 +-
tools/perf/util/hist.c | 4 +-
tools/perf/util/intel-pt.c | 4 +-
tools/perf/util/llvm-utils.c | 4 +-
tools/perf/util/probe-event.c | 11 +-
.../perf/util/scripting-engines/trace-event-perl.c | 6 +-
tools/perf/util/trace-event-info.c | 33 +-
tools/perf/util/trace-event-parse.c | 17 ++
tools/perf/util/trace-event-read.c | 77 ++++-
tools/perf/util/trace-event.h | 1 +
tools/perf/util/unwind-libunwind-local.c | 54 +++-
46 files changed, 1192 insertions(+), 133 deletions(-)
create mode 100644 tools/build/feature/test-xed.c
create mode 100644 tools/perf/Documentation/perf-ftrace.txt
create mode 100644 tools/perf/arch/x86/util/dis.c
create mode 100644 tools/perf/builtin-ftrace.c
create mode 100644 tools/perf/util/dis.c
create mode 100644 tools/perf/util/dis.h

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, plus objtool and samples/bpf/ where they are
supported.

Several are cross builds (the ones with -x-ARCH, and the android one). Those
may not have all the features built, due to the lack of multi-arch devel
packages, which are available and in use so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one performs a variety of tests exercising tools/perf/util/
and tools/lib/{bpf,traceevent,etc}. It also runs perf commands with a variety
of command line event specifications, intercepting the sys_perf_event_open
syscall to check that the perf_event_attr fields are set up as expected, among
a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build
tools/perf/ with a variety of feature sets, exercising the build with an
incomplete set of features as well as with a complete one. The plan is to run
them on each of the containers mentioned above, using some container
orchestration infrastructure. Get in touch if you are interested in helping to
put this in place.

# dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 debian:experimental-x-arm64: Ok
11 debian:experimental-x-mips: Ok
12 debian:experimental-x-mips64: Ok
13 debian:experimental-x-mipsel: Ok
14 fedora:20: Ok
15 fedora:21: Ok
16 fedora:22: Ok
17 fedora:23: Ok
18 fedora:24: Ok
19 fedora:24-x-ARC-uClibc: Ok
20 fedora:25: Ok
21 fedora:rawhide: Ok
22 mageia:5: Ok
23 opensuse:13.2: Ok
24 opensuse:42.1: Ok
25 opensuse:tumbleweed: Ok
26 ubuntu:12.04.5: Ok
27 ubuntu:14.04.4-x-linaro-arm64: Ok
28 ubuntu:15.10: Ok
29 ubuntu:16.04: Ok
30 ubuntu:16.04-x-arm: Ok
31 ubuntu:16.04-x-arm64: Ok
32 ubuntu:16.04-x-powerpc: Ok
33 ubuntu:16.04-x-powerpc64: Ok
34 ubuntu:16.04-x-powerpc64el: Ok
35 ubuntu:16.04-x-s390: Ok
36 ubuntu:16.10: Ok
#

# uname -a
Linux jouet 4.9.0+ #2 SMP Wed Dec 21 11:54:44 BRT 2016 x86_64 x86_64 x86_64 GNU/Linux
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: PERF_RECORD_* events & perf_sample fields : Ok
7: Parse perf pmu format : Ok
8: DSO data read : Ok
9: DSO data cache : Ok
10: DSO data reopen : Ok
11: Roundtrip evsel->name : Ok
12: Parse sched tracepoints fields : Ok
13: syscalls:sys_enter_openat event fields : Ok
14: Setup struct perf_event_attr : Ok
15: Match and link multiple hists : Ok
16: 'import perf' in python : Ok
17: Breakpoint overflow signal handler : Ok
18: Breakpoint overflow sampling : Ok
19: Number of exit events of a simple workload : Ok
20: Software clock events period values : Ok
21: Object code reading : Ok
22: Sample parsing : Ok
23: Use a dummy software event to keep tracking: Ok
24: Parse with no sample_id_all bit set : Ok
25: Filter hist entries : Ok
26: Lookup mmap thread : Ok
27: Share thread mg : Ok
28: Sort output of hist entries : Ok
29: Cumulate child hist entries : Ok
30: Track with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: kmod_path__parse : Ok
34: Thread map : Ok
35: LLVM search and compile :
35.1: Basic BPF llvm compile : Ok
35.2: kbuild searching : Ok
35.3: Compile source for BPF prologue generation: Ok
35.4: Compile source for BPF relocation : Ok
36: Session topology : Ok
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF prologue generation : Ok
37.3: BPF relocation checker : Ok
38: Synthesize thread map : Ok
39: Remove thread map : Ok
40: Synthesize cpu map : Ok
41: Synthesize stat config : Ok
42: Synthesize stat : Ok
43: Synthesize stat round : Ok
44: Synthesize attr update : Ok
45: Event times : Ok
46: Read backward ring buffer : Ok
47: Print cpu map : Ok
48: Probe SDT events : Ok
49: is_printable_array : Ok
50: Print bitmap : Ok
51: perf hooks : Ok
52: builtin clang support : Skip (not compiled in)
53: unit_number__scnprintf : Ok
54: x86 rdpmc : Ok
55: Convert perf time to TSC : Ok
56: DWARF unwind : Ok
57: x86 instruction decoder - new instructions : Ok
58: Intel cqm nmi context read : Skip
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_libbpf_O: make NO_LIBBPF=1
make_no_demangle_O: make NO_DEMANGLE=1
make_tags_O: make tags
make_no_libbionic_O: make NO_LIBBIONIC=1
make_help_O: make help
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_newt_O: make NO_NEWT=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_auxtrace_O: make NO_AUXTRACE=1
make_install_prefix_O: make install prefix=/tmp/krava
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_debug_O: make DEBUG=1
make_util_map_o_O: make util/map.o
make_no_libelf_O: make NO_LIBELF=1
make_install_bin_O: make install-bin
make_no_libpython_O: make NO_LIBPYTHON=1
make_install_O: make install
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_gtk2_O: make NO_GTK2=1
make_pure_O: make
make_no_libperl_O: make NO_LIBPERL=1
make_no_slang_O: make NO_SLANG=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_libaudit_O: make NO_LIBAUDIT=1
make_clean_all_O: make clean all
make_perf_o_O: make perf.o
make_static_O: make LDFLAGS=-static
make_doc_O: make doc
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
$