Re: [PATCH] perf/core: Add a tracepoint for perf sampling

From: Brendan Gregg
Date: Fri Jul 29 2016 - 15:56:20 EST


On Fri, Jul 29, 2016 at 12:21 PM, Arnaldo Carvalho de Melo
<acme@xxxxxxxxxx> wrote:
> Em Tue, Jul 19, 2016 at 11:20:48PM +0000, Brendan Gregg escreveu:
>> When perf is performing hrtimer-based sampling, this tracepoint can be used
>> by BPF to run additional logic on each sample. For example, BPF can fetch
>> stack traces and frequency count them in kernel context, for an efficient
>> profiler.
>
> Could you provide a complete experience? I.e. together with this patch a
> bpf script that could then run, with the full set of steps needed to
> show it in use.

There's currently profile.py, in bcc, which uses this tracepoint if it
exists and otherwise falls back to a kprobe (although the kprobe is
unreliable). profile samples stack traces and prints each unique stack
with its occurrence count. E.g.:

# ./profile
Sampling at 49 Hertz of all threads by user + kernel stack... Hit Ctrl-C to end.
^C
ffffffff81189249 filemap_map_pages
ffffffff811bd3f5 handle_mm_fault
ffffffff81065990 __do_page_fault
ffffffff81065caf do_page_fault
ffffffff817ce228 page_fault
00007fed989afcc0 [unknown]
- cp (9036)
1
[...]

ffffffff8105eb66 native_safe_halt
ffffffff8103659e default_idle
ffffffff81036d1f arch_cpu_idle
ffffffff810bba5a default_idle_call
ffffffff810bbd07 cpu_startup_entry
ffffffff817bf4a7 rest_init
ffffffff81d65f58 start_kernel
ffffffff81d652db x86_64_start_reservations
ffffffff81d65418 x86_64_start_kernel
- swapper/0 (0)
72

ffffffff8105eb66 native_safe_halt
ffffffff8103659e default_idle
ffffffff81036d1f arch_cpu_idle
ffffffff810bba5a default_idle_call
ffffffff810bbd07 cpu_startup_entry
ffffffff8104df55 start_secondary
- swapper/1 (0)
75
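
As a rough illustration of what "frequency count" means for output like
the above, here is a minimal user-space Python sketch. It is not how
profile.py is implemented (that does the counting in kernel context via
BPF); the function names count_stacks and format_counts, and the sample
data, are hypothetical and exist only for this illustration.

```python
from collections import Counter

def count_stacks(samples):
    """Frequency count identical (stack, comm, pid) samples."""
    counts = Counter()
    for stack, comm, pid in samples:
        # A stack is a list of (address, symbol) frames; make it hashable.
        counts[(tuple(stack), comm, pid)] += 1
    return counts

def format_counts(counts):
    """Render counts least-frequent first: frames, '- comm (pid)', count."""
    lines = []
    for (stack, comm, pid), n in sorted(counts.items(), key=lambda kv: kv[1]):
        for addr, sym in stack:
            lines.append("%016x %s" % (addr, sym))
        lines.append("- %s (%d)" % (comm, pid))
        lines.append("%d" % n)
        lines.append("")
    return "\n".join(lines)

# Hypothetical samples (truncated stacks), assumed for illustration only.
idle = [(0xffffffff8105eb66, "native_safe_halt"),
        (0xffffffff8103659e, "default_idle")]
fault = [(0xffffffff81189249, "filemap_map_pages"),
         (0xffffffff811bd3f5, "handle_mm_fault")]
samples = [(idle, "swapper/1", 0)] * 3 + [(fault, "cp", 9036)]

counts = count_stacks(samples)
print(format_counts(counts))
```

Only one counter per unique stack needs to be kept, rather than one
record per sample, which is where the efficiency of counting in kernel
context comes from.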

The tool and examples are on GitHub [1][2]. Is this sufficient for this
patch? If not, I could rewrite something for samples/bpf (e.g., an IP
sampler, or a task priority sampler), which I may do anyway as a
follow-on if they turn out to be nice examples.

>
> Also, what would be the value when BPF is not used?
>

No big reason comes to mind. I could imagine it being useful when
debugging perf's sampling behavior, and there might be uses with
ftrace as well. But the main motivation is extending perf's existing
sampling capabilities to in-kernel frequency counts of stack traces
(which could include custom BPF-based stack walkers), IP, task
priority, etc. Thanks,

Brendan

[1] https://github.com/iovisor/bcc/blob/master/tools/profile.py
[2] https://github.com/iovisor/bcc/blob/master/tools/profile_example.txt