Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

From: Brendan Gregg
Date: Tue Nov 03 2015 - 16:34:18 EST


On Tue, Nov 3, 2015 at 6:40 AM, Arnaldo Carvalho de Melo
<arnaldo.melo@xxxxxxxxx> wrote:
> Em Tue, Nov 03, 2015 at 09:52:07PM +0900, Namhyung Kim escreveu:
>> Hello,
>>
>> This is what Brendan requested on the perf-users mailing list [1] to
>> support FlameGraphs [2] more efficiently. This patchset adds a few
>> more callchain options to adjust the output for it.
>>
>> * changes in v4)
>> - add missing doc update
>> - cleanup/fix callchain value print code
>> - add Acked-by from Brendan and Jiri
>
> Do those Acked-by stand? Things changed, the values moved from the end
> of the line to the start, etc.
>
[...]

I'd Ack this change as it's a useful addition. It doesn't quite
address the folded-only output, but it's a step in that direction. I
think having the value at the start of a line only makes sense for the
perf report output containing the hist summary lines, for consistency.

Here's how I'd shuffle the output of this patch (ignore word wrap
issues with this email):

# ./perf report --stdio -g folded,count,caller -F pid | \
awk '/^ / { n = $1 }
/^[0-9]/ { split(n,a,":"); print a[2] "-" a[1] ";" $2,$1 }'
swapper-0;cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 809
swapper-0;xen_start_kernel;x86_64_start_reservations;start_kernel;rest_init;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 135
dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;check_events;xen_hypercall_xen_version 63
dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf 54
dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;memset_erms 3
dd-30551;xen_irq_enable_direct_end;check_events;xen_hypercall_xen_version 3

So the output is folded stacks, prefixed by comm-PID. Shuffling the
summarized output is a lot better than doing a "perf script" dump and
re-processing call chains. (Note that since I'm using -F, I didn't
need --no-children; and with "-g count", I didn't need
--show-nr-samples.)
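
For anyone who would rather not read the awk, the same shuffle can be
sketched in Python. This is only an illustration, not part of the
patchset: the name fold_report and the sample lines are made up, and it
assumes the layout the awk script relies on (a hist line that starts
with a space and carries "PID:comm", followed by lines of the form
"count folded-stack").

```python
def fold_report(lines):
    """Yield 'comm-PID;stack count' strings, like the awk one-liner.

    Assumed input layout (hypothetical, inferred from the awk script):
      - hist lines start with a space; their first field is 'PID:comm'
      - callchain lines start with a digit: '<count> <folded stack>'
    Anything else (comments, blank lines) is skipped.
    """
    pid_comm = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if line[0] == " ":
            # Remember the most recent 'PID:comm' from a hist line.
            pid_comm = fields[0]
        elif line[0].isdigit() and len(fields) >= 2 and pid_comm:
            pid, _, comm = pid_comm.partition(":")
            # Stack first, count last: the usual folded-stack format.
            yield f"{comm}-{pid};{fields[1]} {fields[0]}"


# Made-up sample input, shortened from the stacks shown above.
sample = [
    "# comment line",
    "    30551:dd",
    "63 __GI___libc_read;sys_read",
    "",
    "        0:swapper",
    "809 cpu_bringup_and_idle;cpu_startup_entry",
]

for folded in fold_report(sample):
    print(folded)
```

Like the awk version, this splits on whitespace, so a comm containing
spaces would need more careful parsing.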

I notice the fields (-F) option already has this precedent:

- "comm": prints PID:comm
- "pid": prints PID

If these were added to -g, along with a "no-hists" option, then the two
types of folded-only output could be generated using:

perf report --stdio -g folded,count,comm,no-hists,caller
perf report --stdio -g folded,count,pid,no-hists,caller

... although "no-hists" doesn't strike me as intuitive. How about "-F
none" to specify zero columns? i.e.:

perf report --stdio -g folded,count,comm,caller -F none
perf report --stdio -g folded,count,pid,caller -F none

Brendan