[GIT PULL] perf changes

From: Ingo Molnar
Date: Mon Jan 11 2016 - 07:56:19 EST


Linus,

Please pull the latest perf-core-for-linus git tree from:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf-core-for-linus

# HEAD: 3eb9ede23bdd96e9ba60e2b4d4d17a7c35d58448 Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Kernel side changes:

- Intel Knights Landing support. (Harish Chegondi)

- Intel Broadwell-EP uncore PMU support. (Kan Liang)

- Core code improvements. (Peter Zijlstra)

- Event filter, LBR and PEBS fixes. (Stephane Eranian)

- Enable cycles:pp on Intel Atom. (Stephane Eranian)

- Add cycles:ppp support for Skylake. (Andi Kleen)

- Various x86 NMI overhead optimizations. (Andi Kleen)

- Intel PT enhancements. (Takao Indoh)

- AMD cache events fix. (Vince Weaver)

Tons of tooling changes:

- Show random perf tool tips in the 'perf report' bottom line (Namhyung Kim)

- perf report now defaults to --group if the perf.data file has grouped events. Try it with:

# perf record -e '{cycles,instructions}' -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ]
# perf report
# Samples: 1K of event 'anon group { cycles, instructions }'
# Event count (approx.): 1955219195
#
# Overhead Command Shared Object Symbol

2.86% 0.22% swapper [kernel.kallsyms] [k] intel_idle
1.05% 0.33% firefox libxul.so [.] js::SetObjectElement
1.05% 0.00% kworker/0:3 [kernel.kallsyms] [k] gen6_ring_get_seqno
0.88% 0.17% chrome chrome [.] 0x0000000000ee27ab
0.65% 0.86% firefox libxul.so [.] js::ValueToId<(js::AllowGC)1>
0.64% 0.23% JS Helper libxul.so [.] js::SplayTree<js::jit::LiveRange*, js::jit::LiveRange>::splay
0.62% 1.27% firefox libxul.so [.] js::GetIterator
0.61% 1.74% firefox libxul.so [.] js::NativeSetProperty
0.61% 0.31% firefox libxul.so [.] js::SetPropertyByDefining
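
To post-process a grouped report like the one above, a minimal Python sketch
follows. It assumes the exact --stdio layout shown here (two percentage
columns, one per event in the group, then command, shared object, a [k] or [.]
marker and the symbol). The helper name parse_group_rows and the regular
expression are illustrative only, not part of perf:

```python
import re

# One row of the grouped --stdio output shown above:
#   <pct event 1> <pct event 2> <command> <shared object> [k or .] <symbol>
# Commands and symbols may contain spaces, so the shared object is taken to
# be the last whitespace-free token before the "[k]" / "[.]" marker.
ROW = re.compile(
    r"^\s*(\d+\.\d+)%\s+(\d+\.\d+)%\s+(.+?)\s+(\S+)\s+\[(.)\]\s+(.*)$"
)


def parse_group_rows(text, events=("cycles", "instructions")):
    """Return one dict per histogram row; '#' and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        m = ROW.match(line)
        if m is None:
            continue
        first, second, comm, dso, level, sym = m.groups()
        rows.append({
            events[0]: float(first),
            events[1]: float(second),
            "command": comm,
            "dso": dso,
            "level": level,
            "symbol": sym,
        })
    return rows


SAMPLE = """\
# Samples: 1K of event 'anon group { cycles, instructions }'
2.86% 0.22% swapper [kernel.kallsyms] [k] intel_idle
0.64% 0.23% JS Helper libxul.so [.] js::SplayTree<js::jit::LiveRange*, js::jit::LiveRange>::splay
0.62% 1.27% firefox libxul.so [.] js::GetIterator
0.61% 1.74% firefox libxul.so [.] js::NativeSetProperty
"""

rows = parse_group_rows(SAMPLE)
# Symbols whose share of instructions exceeds their share of cycles.
dense = [r["symbol"] for r in rows if r["instructions"] > r["cycles"]]
print(dense)
```

The lazy command match is a heuristic: it would misparse a shared object path
that itself contains spaces.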


- Introduce the 'perf stat record/report' workflow:

Generate perf.data files from 'perf stat', to tap into the scripting
capabilities perf already has, instead of defining 'perf stat'-specific
scripting support to calculate event ratios, etc. Simple example:

$ perf stat record -e cycles usleep 1

Performance counter stats for 'usleep 1':

1,134,996 cycles

0.000670644 seconds time elapsed

$ perf stat report

Performance counter stats for '/home/acme/bin/perf stat record -e cycles usleep 1':

1,134,996 cycles

0.000670644 seconds time elapsed

$

It generates PERF_RECORD_ userspace records to store the details:

$ perf report -D | grep PERF_RECORD
0xf0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27637
0x118 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x12a [0x40]: PERF_RECORD_STAT_CONFIG
0x16a [0x30]: PERF_RECORD_STAT
-1 -1 0x19a [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
0x1da [0x18]: PERF_RECORD_STAT_ROUND
[acme@ssdandy linux]$

An effort was made so that perf.data files generated like this do not
trigger cryptic messages when processed by older tools.

The 'perf script' bits need rebasing, will go up later.
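
A quick way to see which record types a 'perf stat record' session produced is
to tally the 'perf report -D' dump. The Python sketch below does only text
processing on output in the form shown above. The function name and the
STAT_USER_RECORDS set are assumptions for illustration, with the set inferred
from the records visible in this example:

```python
import re
from collections import Counter

# Record names as they appear in 'perf report -D' output.
RECORD = re.compile(r"\b(PERF_RECORD_[A-Z0-9_]+)")

# Assumption: the userspace records written by 'perf stat record', taken from
# the example dump above (not an exhaustive or authoritative list).
STAT_USER_RECORDS = {
    "PERF_RECORD_THREAD_MAP",
    "PERF_RECORD_CPU_MAP",
    "PERF_RECORD_STAT_CONFIG",
    "PERF_RECORD_STAT",
    "PERF_RECORD_STAT_ROUND",
}


def count_record_types(dump_text):
    """Count how often each PERF_RECORD_* type appears in a raw dump."""
    counts = Counter()
    for line in dump_text.splitlines():
        m = RECORD.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


SAMPLE = """\
0xf0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27637
0x118 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x12a [0x40]: PERF_RECORD_STAT_CONFIG
0x16a [0x30]: PERF_RECORD_STAT
-1 -1 0x19a [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
0x1da [0x18]: PERF_RECORD_STAT_ROUND
"""

counts = count_record_types(SAMPLE)
stat_only = sorted(name for name in counts if name in STAT_USER_RECORDS)
for name in stat_only:
    print(name, counts[name])
```

In practice the input would come from piping 'perf report -D' into the script
rather than from an embedded string.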

- Make command line options always available, even when they
depend on some feature being enabled, warning the user about
use of such options (Wang Nan)

- Support hw breakpoint events (mem:0xAddress) in the default output mode in
'perf script' (Wang Nan)

- Fixes and improvements for annotating ARM binaries, including support for
ARM call and jump instructions. More work is needed to have arch specific
stuff separated into tools/perf/arch/*/annotate/ (Russell King)

- Add initial 'perf config' command, for now just with a --list option to show
the contents of the configuration file in use, plus a basic man page
describing its format. Commands for doing edits and detailed documentation
are being reviewed and proof-read. (Taeung Song)

- Allow BPF scriptlets to specify arguments to be fetched using
DWARF info, using a prologue generated at compile/build time (He Kuang, Wang Nan)

- Allow attaching BPF scriptlets to module symbols (Wang Nan)

- Allow attaching BPF scriptlets to userspace code using uprobe (Wang Nan)

- BPF programs can now specify 'perf probe' tunables via their section names,
separating key=val values using semicolons (Wang Nan)

Testing some of these new BPF features:

Use case: get callchains when receiving SSL packets, filtering them in the
kernel, at an arbitrary place.

# cat ssl.bpf.c
#define SEC(NAME) __attribute__((section(NAME), used))

struct pt_regs;

SEC("func=__inet_lookup_established hnum")
int func(struct pt_regs *ctx, int err, unsigned short port)
{
	return err == 0 && port == 443;
}

char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
#
# perf record -a -g -e ssl.bpf.c
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ]
# perf script | head -30
swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux)
856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux)
2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux)
2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux)
96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux)
969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux)
2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux)
95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux)
1163ffa start_kernel ([kernel.vmlinux].init.text)
11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text)
1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)

qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux)
8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux)
430a br_handle_frame_finish ([bridge])
48bc br_handle_frame ([bridge])
855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
#

Use the various 'perf probe' options to list functions and to see what
variables can be collected at any given point. Experiment first by collecting
without a filter, then filter. Use it together with 'perf trace' and
'perf top', with or without callchains. If it explodes, please tell us!
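
As a sanity check on the 'perf script' output above: hnum=0x1bb is hex for
443, the port the scriptlet filters on. A small Python sketch that pulls the
key=value arguments off an event header line and converts them to integers
could look like this. The helper name probe_args is made up here, and the
parsing assumes the header layout shown in this example:

```python
import re

# key=value pairs where the value is hex (0x...) or plain decimal.
ARG = re.compile(r"(\w+)=(0x[0-9a-fA-F]+|\d+)")


def probe_args(line):
    """Return {name: int} for the arguments that follow the '(address) '
    part of a 'perf script' event header line; {} for any other line."""
    _, sep, tail = line.partition(") ")
    if not sep:
        return {}
    args = {}
    for name, value in ARG.findall(tail):
        base = 16 if value.lower().startswith("0x") else 10
        args[name] = int(value, base)
    return args


SAMPLE = [
    "swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb",
    "8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)",
    "qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb",
]

ports = [probe_args(line)["hnum"] for line in SAMPLE if probe_args(line)]
print(ports)  # every captured sample should carry port 443
```

Callchain lines end in ")" with nothing after it, so they yield an empty dict
and are skipped.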

- Introduce a new callchain mode: "folded", that lists one-line
representations of all callchains for a given histogram entry, facilitating
'perf report' output processing by other tools, such as Brendan Gregg's
flamegraph tools (Namhyung Kim)

E.g:

# perf report | grep -v ^# | head
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
|
---cpu_startup_entry
|
|--12.07%--start_secondary
|
--6.30%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
#

Becomes, in "folded" mode:

# perf report -g folded | grep -v ^# | head -5
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
12.07% cpu_startup_entry;start_secondary
6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
11.23% call_cpuidle;cpu_startup_entry;start_secondary
5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
#

The user can also select one of "count", "period" or "percent" as the first column.
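
Since the point of "folded" mode is easier post-processing, here is a minimal
Python sketch that reads output like the above into a simple structure. It
assumes the layout in this example: a histogram row starts with two percentage
columns, and each callchain row under it has a single percentage followed by
semicolon-separated frames. The name parse_folded is illustrative, and the
"count" and "period" first-column variants are not handled:

```python
import re

PCT = r"(\d+(?:\.\d+)?)%"
# Histogram row: two percentage columns, then command/object/symbol text.
HIST_ROW = re.compile(r"^\s*" + PCT + r"\s+" + PCT + r"\s+(.*)$")
# Callchain row: one percentage, then "frame;frame;...".
CHAIN_ROW = re.compile(r"^\s*" + PCT + r"\s+(.*)$")


def parse_folded(text):
    """Group folded callchain rows under the histogram row they follow."""
    entries = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        m = HIST_ROW.match(line)
        if m:
            entries.append({
                "overheads": (float(m.group(1)), float(m.group(2))),
                "entry": m.group(3),
                "chains": [],
            })
            continue
        m = CHAIN_ROW.match(line)
        if m and entries:
            frames = m.group(2).split(";")
            entries[-1]["chains"].append((float(m.group(1)), frames))
    return entries


SAMPLE = """\
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
12.07% cpu_startup_entry;start_secondary
6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
11.23% call_cpuidle;cpu_startup_entry;start_secondary
"""

entries = parse_folded(SAMPLE)
for entry in entries:
    for pct, frames in entry["chains"]:
        print(pct, " <- ".join(frames))
```

Note that in this example the same chains show up under several histogram
entries, so summing chain percentages across entries would count them more
than once.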

... and lots of infrastructure enhancements, plus fixes, other changes and
features I failed to list - see the shortlog and the git log for details.

Thanks,

Ingo

-----{ shortlog and diffstat done manually }--------------->

Adrian Hunter (1):
perf evlist: Make perf_evlist__open() open evsels with their cpus and threads (like perf record does)

Andi Kleen (14):
x86: Add an inlined __copy_from_user_nmi() variant
perf/x86: Optimize stack walk user accesses
perf/x86: Add option to disable reading branch flags/cycles
perf/x86: Handle multiple umask bits for BDW CYCLE_ACTIVITY.*
perf/x86/intel: Fix __initconst declaration in the RAPL perf driver
x86/headers: Don't include asm/processor.h in asm/atomic.h
tracepoints: Move struct tracepoint to new tracepoint-defs.h header
x86, tracing, perf: Add trace point for MSR accesses
perf/x86: Remove old MSR perf tracing code
perf evsel: Disable branch flags/cycles for --callgraph lbr
perf/x86: Remove warning for zero PEBS status
perf/x86: Allow zero PEBS status with only single active event
perf/x86: Use INST_RETIRED.TOTAL_CYCLES_PS for cycles:pp for Skylake
perf/x86: Use INST_RETIRED.PREC_DIST for cycles: ppp

Arnaldo Carvalho de Melo (20):
perf test: Fix build of BPF and LLVM on older glibc libraries
tools: Adopt memdup() from tools/perf, moving it to tools/lib/string.c
perf tests: Pass the subtest index to each test routine
perf list: Add support for PERF_COUNT_SW_BPF_OUT
perf list: Robustify event printing routine
perf top: Do show usage message when failing to create cpu/thread maps
Revert "perf tools: Improve setting of gcc debug option"
perf tools: Use same signal handling strategy as 'record'
perf test: Dump the stack when test segfaults when in verbose mode
perf thread: Fix reference count initial state
perf tests: No need to set attr.sample_freq in the perf time to TSC test
perf evlist: Introduce perf_evlist__new_dummy constructor
perf test: Use "dummy" events in the PERF_RECORD_ test
perf test: No need for setting attr.sample_freq on the RECORD test
perf python: Add missing files to binding link list
perf tests: No need to set attr.sample_freq for tracking !PERF_RECORD_SAMPLE
perf tests: Give a bit more information on the CQM test failure path
tools lib: Move find_next_bit.c to tools/lib/
tools lib: Sync tools/lib/find_bit.c with the kernel
tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/

Ekaterina Tumanova (2):
perf symbols: Refactor vmlinux_path__init() to ease path additions
perf symbols: Add the path to vmlinux.debug

Harish Chegondi (3):
perf/x86/intel: Add perf core PMU support for Intel Knights Landing
perf/x86/intel/uncore: Remove hard coding of PMON box control MSR offset
perf/x86/intel/uncore: Add Knights Landing uncore PMU support

He Kuang (3):
perf bpf: Add prologue for BPF programs for fetching arguments
bpf tools: Add helper function for updating bpf maps elements
perf record: Support custom vmlinux path

Huang Rui (1):
perf/x86/rapl: Use unified perf_event_sysfs_show instead of special interface

Ingo Molnar (18):
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/.../acme/linux into perf/core
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/.../acme/linux into perf/core
Merge tag 'perf-core-for-mingo-2' of git://git.kernel.org/.../acme/linux into perf/core
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/.../acme/linux into perf/core
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/.../acme/linux into perf/core
Merge branch 'perf/urgent' into perf/core, to pick up fixes
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/.../acme/linux into perf/core
Merge branch 'perf/urgent' into perf/core, to pick up fixes
perf tui: Change default selection background color to yellow
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/.../acme/linux into perf/core
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/.../acme/linux into perf/core
Merge tag 'v4.4-rc5' into perf/core, to pick up fixes
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/.../acme/linux into perf/core
Merge tag 'perf-core-for-mingo-2.1' of git://git.kernel.org/.../acme/linux into perf/core
Merge branch 'perf/urgent' into perf/core, to make sure a cherry-picked commit does not create conflicts
Merge tag 'perf-core-for-mingo-3' of git://git.kernel.org/.../acme/linux into perf/core
Merge branch 'perf/urgent' into perf/core, to pick up fixes before applying new changes
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/.../acme/linux into perf/core

Jiri Olsa (92):
perf callchain: Move initial entry call into get_entries function
perf callchain: Add order support for libunwind DWARF unwinder
perf test: Add callchain order setup for DWARF unwinder test
perf callchain: Add order support for libdw DWARF unwinder
perf callchain: Add missing parent_val initialization
perf script: Remove default_scripting_ops
perf build: Fix traceevent plugins build race
perf script: Pass perf_script into process_event
tools build: Use fixdep with OUTPUT path prefix
perf stat: Clear sample_(type|period) for counting
perf evlist: Display WEIGHT sample type bit
perf test: 'unwind' test should create kernel maps
perf test: Use machine__new_host in dwarf unwind test
perf test: Use machine__new_host in mmap thread lookup test
perf test: Use machine__new_host in mmap thread code reading test
perf test: Fix cpus and thread maps reference in error path
perf test: Prevent using bpf-output event in round trip name test
perf test: Create kernel maps properly for hist entries test
perf evsel: Use event maps directly in perf_evsel__enable
perf evsel: Introduce disable() method
perf evlist: Factor perf_evlist__(enable|disable) functions
perf stat: Use perf_evlist__enable in handle_initial_delay
perf stat: Create events as disabled
perf stat: Move enable_on_exec setup under earlier code
perf thread_map: Add thread_map user level event
perf thread_map: Add thread_map event sythesize function
perf thread_map: Add thread_map__new_event function
perf thread_map: Add perf_event__fprintf_thread_map function
perf cpu_map: Add cpu_map user level event
perf cpu_map: Add cpu_map event synthesize function
perf cpu_map: Add cpu_map__new_event function
perf cpu_map: Add perf_event__fprintf_cpu_map function
perf tools: Add stat config user level event
perf tools: Add stat config event synthesize function
perf tools: Add stat config event read function
perf tools: Add stat user level event
perf tools: Add stat event synthesize function
perf tools: Add stat event read function
perf tools: Add stat round user level event
perf tools: Add stat round event synthesize function
perf tools: Add stat events fprintf functions
perf tools: Add event_update user level event
perf tools: Add event_update event unit type
perf tools: Add event_update event scale type
perf tools: Add event_update event name type
perf tools: Add event_update event cpus type
perf tools: Add perf_event__fprintf_event_update function
perf report: Display newly added events in raw dump
perf tools: Introduce stat perf.data header feature
perf stat record: Add record command
perf stat record: Initialize record features
perf stat record: Synthesize stat record data
perf evlist: Export id_add_fd()
perf stat record: Store events IDs in perf data file
perf stat record: Add pipe support for record command
perf stat record: Write stat events on record
perf stat record: Write stat round events on record
perf stat record: Do not allow record with multiple runs mode
perf stat record: Synthesize event update events
perf stat report: Add report command
perf stat report: Process cpu/threads maps
perf stat report: Process stat config event
perf stat report: Add support to initialize aggr_map from file
perf stat report: Move csv_sep initialization before report command
perf stat report: Process stat and stat round events
perf stat report: Process event update events
perf stat report: Allow to override aggr_mode
tools build feature: Fix feature_check_display_code typo
tools build feature: Move dwarf post unwind choice output into perf
tools build feature: Introduce feature_assign macro
tools build feature: Use value assignment form for FEATURE-DUMP file
perf build: Use FEATURE-DUMP in bpf subproject
perf stat record: Keep sample_type 0 for pipe session
perf script: Process cpu/threads maps
perf script: Process stat config event
perf script: Add process_stat/process_stat_interval scripting interface
perf script: Add stat default handlers
perf script: Add python support for stat events
perf cpumap: Fix cpu conversion in cpu_map__from_entries
perf script: Display stat events by default
perf script: Add stat-cpi.py script
perf tools: Do not show trace command if it's not compiled in
perf script: Align event name properly
perf tools: Include all tools/lib directory for tags/cscope/TAGS targets
perf tools: Remove list entry from struct sort_entry
perf tools: Add overhead/overhead_children keys defaults via string
perf diff: Use perf_hpp__register_sort_field interface
perf evlist: Remove perf_evlist__(enable|disable)_event functions
perf unwind: Use find_map function in access_dso_mem
perf unwind: Check for mmaps also in MAP__VARIABLE tree
perf libdw: Check for mmaps also in MAP__VARIABLE tree
perf record: Store data mmaps for dwarf unwind

Josh Poimboeuf (22):
perf tools: Remove unused pager_use_color variable
perf tools: Move term functions out of util.c
perf tools: Save cmdline arguments earlier
perf tools: Move cmd_version() to builtin-version.c
perf build: Remove unnecessary line in Makefile.feature
perf test: Add Build file to dependencies for llvm-src-*.c
perf test: Remove tarpkg at end of test
perf build: Fix 'make clean'
perf build: Rename LIB_PATH -> API_PATH
perf tools: Create pager.h
perf tools: Remove check for unused PERF_PAGER_IN_USE
perf tools: Move help_unknown_cmd() to its own file
perf tools: Convert parse-options.c internal functions to static
tools build: Fix feature Makefile issues with 'O='
perf tools: Move strlcpy() from perf to tools/lib/string.c
perf tools: Document the fact that parse_options*() may exit
perf tools: Provide subcmd configuration at runtime
perf tools: Remove subcmd dependencies on strbuf
perf tools: Remove 'perf' from subcmd function and variable names
perf tools: Finalize subcmd independence
perf subcmd: Create subcmd library
tools subcmd: Rename subcmd header include guards

Kan Liang (1):
perf/x86/intel/uncore: Add Broadwell-EP uncore support

Kevin Hilman (1):
tools: Fix selftests_install Makefile rule

Masami Hiramatsu (18):
perf probe: Fix to free temporal Dwarf_Frame
perf machine: Fix machine__findnew_module_map to put registered map
perf machine: Fix machine__destroy_kernel_maps to drop vmlinux_maps references
perf machine: Fix to destroy kernel maps when machine exits
perf tools: Make perf_exec_path() always return malloc'd string
perf tools: Fix to put new map after inserting to map_groups in dso__load_sym
perf tools: Fix __dsos__addnew to put dso after adding it to the list
perf tools: Fix machine__create_kernel_maps to put kernel dso refcount
perf machine: Fix machine__findnew_module_map to put dso
perf probe: Fix to free temporal Dwarf_Frame correctly
perf tools: Fix map_groups__clone to put cloned map
perf stat: Fix cmd_stat to release cpu_map
perf hists: Fix hists_evsel to release hists
perf tools: Fix maps__fixup_overlappings to put used maps
perf machine: Fix machine.vmlinux_maps to make sure to clear the old one
perf tools: Fix write_numa_topology to put cpu_map instead of free
perf tools: Make perf_session__register_idle_thread drop the refcount
perf symbols: Fix dso__load_sym to put dso

Michael Petlan (1):
perf buildid-list: Show running kernel build id fix

Namhyung Kim (48):
perf report: Support folded callchain mode on --stdio
perf callchain: Abstract callchain print function
perf callchain: Add count fields to struct callchain_node
perf report: Add callchain value option
perf hists browser: Factor out hist_browser__show_callchain_list()
perf hists browser: Support flat callchains
perf hists browser: Support folded callchains
perf ui/gtk: Support flat callchains
perf ui/gtk: Support folded callchains
perf callchain: Honor hide_unresolved
perf top: Fix freeze on --call-graph flat/folded
perf report: Show error message when processing sample fails
perf hists: Do not skip elided fields when processing samples
perf hists browser: Update nr entries regardless of min percent
perf annotate: Check argument before calling setup_browser()
perf annotate: Delay UI browser setup after initialization is done
perf kvm: Remove invocation of setup/exit_browser()
perf report: Check argument before calling setup_browser()
perf thread_map: Free strlist on constructor error path
perf tools: Get rid of exit_browser() from usage_with_options()
perf top: Delete half-processed hist entries when exit
perf top: Do not convert address for perf_top__record_precise_ip()
perf top: Access hists->lock only if needed
perf top: Fix annotation on --stdio
perf top: Cleanup condition in perf_top__record_precise_ip()
perf test: Fix hist testcases when kptr_restrict is on
perf record: Add record.build-id config option
perf hist: Pass struct sample to __hists__add_entry()
perf hist: Save raw_data/size for tracepoint events
tools lib traceevent: Factor out and export print_event_field[s]()
perf top: Create the evlist sooner
perf tools: Pass evlist to setup_sorting()
perf tools: Add dynamic sort key for tracepoint events
perf tools: Try to show pretty printed output for dynamic sort keys
perf tools: Add 'trace' sort key
perf report/top: Add --raw-trace option
perf tools: Support shortcuts for events in dynamic sort keys
perf tools: Support '<event>.*' dynamic sort key
perf tools: Skip dynamic fields not defined for current event
perf tools: Add 'trace_fields' dynamic sort key
perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint events
perf tools: Add all matching dynamic sort keys for field name
perf report: Add documentation for dynamic sort keys
perf top: Decay periods in callchains
perf report: Change default to use event group view
perf hists: Export a couple of hist functions
perf report: Show random usage tip on the help line
perf evlist: Add --trace-fields option to show trace fields

Peter Zijlstra (2):
perf/core: Collapse common IPI pattern
perf/core: Collapse more IPI loops

Russell King (1):
perf annotate: ARM support

Stephane Eranian (5):
perf/x86: Fix filter_events() bug with event mappings
perf/x86: Fix LBR related crashes on Intel Atom
perf/x86: fix PEBS issues on Intel Atom/Core2
perf/x86: Enable cycles:pp for Intel Atom
perf pmu: fix alias->snapshot missing initialization bug

Steven Rostedt (1):
tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines

Taeung Song (2):
perf tools: Add 'perf config' command
perf config: Add initial man page

Takao Indoh (2):
perf/x86/intel/pt: Add interface to stop Intel PT logging
perf, x86: Stop Intel PT before kdump starts

Vince Weaver (1):
perf/x86/amd: Remove l1-dcache-stores event for AMD

Wang Nan (28):
tools: Clone the kernel's strtobool function
bpf tools: Load a program with different instances using preprocessor
perf bpf: Add BPF_PROLOGUE config options for further patches
perf bpf: Compile dwarf-regs.c if CONFIG_BPF_PROLOGUE is on
perf bpf: Allow BPF program attach to uprobe events
perf bpf: Allow attaching BPF programs to modules symbols
perf bpf: Allow BPF program config probing options
perf bpf: Generate prologue for BPF programs
perf test: Test the BPF prologue adding infrastructure
perf test: Fix 'perf test BPF' when it fails to find a suitable vmlinux
perf bpf: Use same BPF program if arguments are identical
perf test: Print result for each LLVM subtest
perf test: Print result for each BPF subtest
perf test: Mute test cases error messages if verbose == 0
tools build: Clean CFLAGS and LDFLAGS for fixdep
tools lib bpf: Don't do a feature check when cleaning
perf machine: Adjust dso->long_name for offline module
tools lib bpf: Collect map definition in bpf_object
tools lib bpf: Extract and collect map names from BPF object file
perf bpf: Rename bpf config to program config
perf machine: Pass correct string to dso__adjust_kmod_long_name
tools lib bpf: Check return value of strdup when reading map names
tools lib bpf: Fetch map names from correct strtab
perf data: Add u32_hex data type
perf script: Add support for PERF_TYPE_BREAKPOINT
perf tools: Clear struct machine during machine__init()
perf tools: Make options always available, even if required libs not linked
perf tools: Add missing headers in perf's MANIFEST

Yannick Brosseau (1):
perf tools: Correctly identify anon_hugepage when generating map (v2)


Documentation/trace/events-msr.txt | 37 ++
Documentation/trace/postprocess/decode_msr.py | 37 ++
arch/x86/include/asm/atomic.h | 1 -
arch/x86/include/asm/atomic64_32.h | 1 -
arch/x86/include/asm/intel_pt.h | 10 +
arch/x86/include/asm/msr-trace.h | 57 ++
arch/x86/include/asm/msr.h | 31 +
arch/x86/include/asm/uaccess.h | 9 +
arch/x86/kernel/cpu/perf_event.c | 36 +-
arch/x86/kernel/cpu/perf_event.h | 21 +-
arch/x86/kernel/cpu/perf_event_amd.c | 2 +-
arch/x86/kernel/cpu/perf_event_intel.c | 115 +++-
arch/x86/kernel/cpu/perf_event_intel_ds.c | 39 +-
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 42 +-
arch/x86/kernel/cpu/perf_event_intel_pt.c | 9 +
arch/x86/kernel/cpu/perf_event_intel_rapl.c | 25 +-
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 17 +
arch/x86/kernel/cpu/perf_event_intel_uncore.h | 3 +
arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c | 2 +-
.../x86/kernel/cpu/perf_event_intel_uncore_snbep.c | 635 ++++++++++++++++++-
arch/x86/kernel/crash.c | 11 +
arch/x86/lib/msr.c | 26 +
include/linux/tracepoint-defs.h | 27 +
include/linux/tracepoint.h | 16 +-
include/uapi/linux/perf_event.h | 6 +
kernel/events/core.c | 321 ++++------
lib/atomic64_test.c | 4 +
tools/Makefile | 2 +-
tools/build/Makefile | 2 +-
tools/build/Makefile.feature | 44 +-
tools/build/Makefile.include | 2 +-
tools/build/feature/Makefile | 93 +--
tools/{perf/util => }/include/linux/bitmap.h | 2 +
tools/include/linux/string.h | 15 +
tools/{perf/util => lib}/bitmap.c | 0
tools/lib/bpf/Makefile | 14 +
tools/lib/bpf/bpf.c | 14 +
tools/lib/bpf/bpf.h | 2 +
tools/lib/bpf/libbpf.c | 412 ++++++++++---
tools/lib/bpf/libbpf.h | 88 +++
tools/lib/find_bit.c | 84 +++
tools/lib/string.c | 89 +++
tools/lib/subcmd/Build | 7 +
tools/lib/subcmd/Makefile | 48 ++
tools/lib/subcmd/exec-cmd.c | 209 +++++++
tools/lib/subcmd/exec-cmd.h | 16 +
tools/{perf/util => lib/subcmd}/help.c | 179 ++----
tools/{perf/util => lib/subcmd}/help.h | 13 +-
tools/{perf/util => lib/subcmd}/pager.c | 23 +-
tools/lib/subcmd/pager.h | 9 +
tools/{perf/util => lib/subcmd}/parse-options.c | 250 ++++++--
tools/{perf/util => lib/subcmd}/parse-options.h | 26 +-
tools/{perf/util => lib/subcmd}/run-command.c | 24 +-
tools/{perf/util => lib/subcmd}/run-command.h | 12 +-
tools/{perf/util => lib/subcmd}/sigchain.c | 3 +-
tools/{perf/util => lib/subcmd}/sigchain.h | 6 +-
tools/lib/subcmd/subcmd-config.c | 11 +
tools/lib/subcmd/subcmd-config.h | 14 +
tools/lib/subcmd/subcmd-util.h | 91 +++
tools/lib/traceevent/event-parse.c | 134 ++--
tools/lib/traceevent/event-parse.h | 4 +
tools/lib/util/find_next_bit.c | 89 ---
tools/perf/Build | 8 +-
tools/perf/Documentation/perf-config.txt | 103 ++++
tools/perf/Documentation/perf-evlist.txt | 3 +
tools/perf/Documentation/perf-record.txt | 24 +-
tools/perf/Documentation/perf-report.txt | 41 +-
tools/perf/Documentation/perf-stat.txt | 34 ++
tools/perf/Documentation/perf-top.txt | 3 +
tools/perf/Documentation/tips.txt | 14 +
tools/perf/MANIFEST | 7 +-
tools/perf/Makefile.perf | 44 +-
tools/perf/arch/x86/include/arch-tests.h | 8 +-
tools/perf/arch/x86/tests/insn-x86.c | 2 +-
tools/perf/arch/x86/tests/intel-cqm.c | 4 +-
tools/perf/arch/x86/tests/perf-time-to-tsc.c | 3 +-
tools/perf/arch/x86/tests/rdpmc.c | 2 +-
tools/perf/arch/x86/util/Build | 1 +
tools/perf/arch/x86/util/intel-bts.c | 4 +-
tools/perf/arch/x86/util/intel-pt.c | 6 +-
tools/perf/bench/futex-hash.c | 2 +-
tools/perf/bench/futex-lock-pi.c | 2 +-
tools/perf/bench/futex-requeue.c | 2 +-
tools/perf/bench/futex-wake-parallel.c | 2 +-
tools/perf/bench/futex-wake.c | 2 +-
tools/perf/bench/mem-functions.c | 2 +-
tools/perf/bench/numa.c | 2 +-
tools/perf/bench/sched-messaging.c | 2 +-
tools/perf/bench/sched-pipe.c | 2 +-
tools/perf/builtin-annotate.c | 44 +-
tools/perf/builtin-bench.c | 2 +-
tools/perf/builtin-buildid-cache.c | 2 +-
tools/perf/builtin-buildid-list.c | 4 +-
tools/perf/builtin-config.c | 66 ++
tools/perf/builtin-data.c | 2 +-
tools/perf/builtin-diff.c | 15 +-
tools/perf/builtin-evlist.c | 13 +-
tools/perf/builtin-help.c | 10 +-
tools/perf/builtin-inject.c | 2 +-
tools/perf/builtin-kmem.c | 2 +-
tools/perf/builtin-kvm.c | 5 +-
tools/perf/builtin-list.c | 2 +-
tools/perf/builtin-lock.c | 2 +-
tools/perf/builtin-mem.c | 2 +-
tools/perf/builtin-probe.c | 17 +-
tools/perf/builtin-record.c | 48 +-
tools/perf/builtin-report.c | 52 +-
tools/perf/builtin-sched.c | 2 +-
tools/perf/builtin-script.c | 245 ++++++--
tools/perf/builtin-stat.c | 679 ++++++++++++++++++++-
tools/perf/builtin-timechart.c | 2 +-
tools/perf/builtin-top.c | 75 ++-
tools/perf/builtin-trace.c | 4 +-
tools/perf/builtin-version.c | 10 +
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 3 +-
tools/perf/config/Makefile | 24 +-
tools/perf/config/utilities.mak | 19 -
tools/perf/perf.c | 24 +-
tools/perf/scripts/python/stat-cpi.py | 77 +++
tools/perf/tests/.gitignore | 1 +
tools/perf/tests/Build | 16 +-
tools/perf/tests/attr.c | 6 +-
tools/perf/tests/bp_signal.c | 2 +-
tools/perf/tests/bp_signal_overflow.c | 2 +-
tools/perf/tests/bpf-script-test-prologue.c | 35 ++
tools/perf/tests/bpf.c | 93 ++-
tools/perf/tests/builtin-test.c | 141 ++++-
tools/perf/tests/code-reading.c | 16 +-
tools/perf/tests/cpumap.c | 88 +++
tools/perf/tests/dso-data.c | 6 +-
tools/perf/tests/dwarf-unwind.c | 37 +-
tools/perf/tests/event_update.c | 117 ++++
tools/perf/tests/evsel-roundtrip-name.c | 5 +-
tools/perf/tests/evsel-tp-sched.c | 2 +-
tools/perf/tests/fdarray.c | 4 +-
tools/perf/tests/hists_common.c | 6 +-
tools/perf/tests/hists_cumulate.c | 10 +-
tools/perf/tests/hists_filter.c | 4 +-
tools/perf/tests/hists_link.c | 10 +-
tools/perf/tests/hists_output.c | 12 +-
tools/perf/tests/keep-tracking.c | 5 +-
tools/perf/tests/kmod-path.c | 2 +-
tools/perf/tests/llvm.c | 75 ++-
tools/perf/tests/llvm.h | 2 +
tools/perf/tests/make | 3 +-
tools/perf/tests/mmap-basic.c | 2 +-
tools/perf/tests/mmap-thread-lookup.c | 8 +-
tools/perf/tests/openat-syscall-all-cpus.c | 2 +-
tools/perf/tests/openat-syscall-tp-fields.c | 2 +-
tools/perf/tests/openat-syscall.c | 2 +-
tools/perf/tests/parse-events.c | 2 +-
tools/perf/tests/parse-no-sample-id-all.c | 2 +-
tools/perf/tests/perf-record.c | 8 +-
tools/perf/tests/pmu.c | 2 +-
tools/perf/tests/python-use.c | 3 +-
tools/perf/tests/sample-parsing.c | 2 +-
tools/perf/tests/stat.c | 111 ++++
tools/perf/tests/sw-clock.c | 2 +-
tools/perf/tests/switch-tracking.c | 8 +-
tools/perf/tests/task-exit.c | 2 +-
tools/perf/tests/tests.h | 95 +--
tools/perf/tests/thread-map.c | 45 +-
tools/perf/tests/thread-mg-share.c | 2 +-
tools/perf/tests/topology.c | 2 +-
tools/perf/tests/vmlinux-kallsyms.c | 2 +-
tools/perf/ui/browser.c | 2 +-
tools/perf/ui/browsers/hists.c | 335 +++++++++-
tools/perf/ui/gtk/hists.c | 152 ++++-
tools/perf/ui/hist.c | 14 +-
tools/perf/ui/stdio/hist.c | 100 ++-
tools/perf/util/Build | 28 +-
tools/perf/util/annotate.c | 23 +
tools/perf/util/auxtrace.c | 2 +-
tools/perf/util/bpf-loader.c | 433 ++++++++++++-
tools/perf/util/bpf-loader.h | 4 +
tools/perf/util/bpf-prologue.c | 455 ++++++++++++++
tools/perf/util/bpf-prologue.h | 34 ++
tools/perf/util/build-id.c | 2 +-
tools/perf/util/cache.h | 14 +-
tools/perf/util/callchain.c | 164 ++++-
tools/perf/util/callchain.h | 30 +-
tools/perf/util/cgroup.c | 2 +-
tools/perf/util/color.c | 2 +-
tools/perf/util/config.c | 2 +-
tools/perf/util/cpumap.c | 51 ++
tools/perf/util/cpumap.h | 1 +
tools/perf/util/data-convert-bt.c | 2 +
tools/perf/util/dso.c | 2 +
tools/perf/util/env.c | 9 -
tools/perf/util/environment.c | 8 -
tools/perf/util/event.c | 308 ++++++++++
tools/perf/util/event.h | 150 ++++-
tools/perf/util/evlist.c | 112 ++--
tools/perf/util/evlist.h | 10 +-
tools/perf/util/evsel.c | 53 +-
tools/perf/util/evsel.h | 4 +-
tools/perf/util/exec_cmd.c | 148 -----
tools/perf/util/exec_cmd.h | 12 -
tools/perf/util/generate-cmdlist.sh | 15 +
tools/perf/util/header.c | 207 ++++++-
tools/perf/util/header.h | 17 +
tools/perf/util/help-unknown-cmd.c | 103 ++++
tools/perf/util/help-unknown-cmd.h | 0
tools/perf/util/hist.c | 118 +++-
tools/perf/util/hist.h | 24 +-
tools/perf/util/include/linux/string.h | 3 -
tools/perf/util/intel-pt.c | 4 +-
tools/perf/util/machine.c | 74 ++-
tools/perf/util/map.c | 7 +-
tools/perf/util/parse-branch-options.c | 2 +-
tools/perf/util/parse-events.c | 10 +-
tools/perf/util/parse-regs-options.c | 2 +-
tools/perf/util/path.c | 18 -
tools/perf/util/pmu.c | 1 +
tools/perf/util/probe-event.c | 7 +-
tools/perf/util/probe-finder.c | 6 +-
tools/perf/util/python-ext-sources | 2 +
.../util/scripting-engines/trace-event-python.c | 115 +++-
tools/perf/util/session.c | 200 +++++-
tools/perf/util/session.h | 2 +-
tools/perf/util/sort.c | 601 +++++++++++++++++-
tools/perf/util/sort.h | 14 +-
tools/perf/util/stat.c | 62 ++
tools/perf/util/stat.h | 10 +
tools/perf/util/string.c | 16 -
tools/perf/util/symbol-elf.c | 9 +-
tools/perf/util/symbol.c | 66 +-
tools/perf/util/symbol.h | 4 +-
tools/perf/util/term.c | 35 ++
tools/perf/util/term.h | 10 +
tools/perf/util/thread.c | 10 +-
tools/perf/util/thread_map.c | 28 +
tools/perf/util/thread_map.h | 3 +
tools/perf/util/tool.h | 8 +-
tools/perf/util/trace-event.h | 4 +
tools/perf/util/unwind-libdw.c | 63 +-
tools/perf/util/unwind-libdw.h | 2 +
tools/perf/util/unwind-libunwind.c | 80 ++-
tools/perf/util/util.c | 67 +-
tools/perf/util/util.h | 20 +-
241 files changed, 9168 insertions(+), 1908 deletions(-)