[tip:perf/core] perf tools: Document --children option in more detail

From: tip-bot for Namhyung Kim
Date: Tue May 05 2015 - 23:11:54 EST


Commit-ID: dd3092075c3263fca7a3e98b143ad2296aab71b4
Gitweb: http://git.kernel.org/tip/dd3092075c3263fca7a3e98b143ad2296aab71b4
Author: Namhyung Kim <namhyung@xxxxxxxxxx>
AuthorDate: Wed, 22 Apr 2015 15:33:45 +0900
Committer: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
CommitDate: Wed, 29 Apr 2015 10:38:06 -0300

perf tools: Document --children option in more detail

As the --children option changes the output of perf report (and perf
top) it sometimes confuses users. Add more words and examples to help
understanding of the option's behavior - and how to disable it ;-).

Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
Reviewed-by: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: David Ahern <dsahern@xxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Cc: Taeung Song <treeze.taeung@xxxxxxxxx>
Link: http://lkml.kernel.org/r/1429684425-14987-1-git-send-email-namhyung@xxxxxxxxxx
Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
---
.../callchain-overhead-calculation.txt | 108 +++++++++++++++++++++
tools/perf/Documentation/perf-report.txt | 4 +
tools/perf/Documentation/perf-top.txt | 3 +-
3 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/callchain-overhead-calculation.txt b/tools/perf/Documentation/callchain-overhead-calculation.txt
new file mode 100644
index 0000000..1a75792
--- /dev/null
+++ b/tools/perf/Documentation/callchain-overhead-calculation.txt
@@ -0,0 +1,108 @@
+Overhead calculation
+--------------------
+The overhead can be shown in two columns as 'Children' and 'Self' when
+perf collects callchains. The 'self' overhead is simply calculated by
+adding all period values of the entry - usually a function (symbol).
+This is the value that perf shows traditionally and sum of all the
+'self' overhead values should be 100%.
+
+The 'children' overhead is calculated by adding all period values of
+the child functions so that it can show the total overhead of the
+higher level functions even if they don't directly execute much.
+'Children' here means functions that are called from another (parent)
+function.
+
+It might be confusing that the sum of all the 'children' overhead
+values exceeds 100% since each of them is already an accumulation of
+'self' overhead of its child functions. But with this enabled, users
+can find which function has the most overhead even if samples are
+spread over the children.
+
+Consider the following example; there are three functions like below.
+
+-----------------------
+void foo(void) {
+ /* do something */
+}
+
+void bar(void) {
+ /* do something */
+ foo();
+}
+
+int main(void) {
+ bar()
+ return 0;
+}
+-----------------------
+
+In this case 'foo' is a child of 'bar', and 'bar' is an immediate
+child of 'main' so 'foo' also is a child of 'main'. In other words,
+'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'.
+
+Suppose all samples are recorded in 'foo' and 'bar' only. When it's
+recorded with callchains the output will show something like below
+in the usual (self-overhead-only) output of perf report:
+
+----------------------------------
+Overhead Symbol
+........ .....................
+ 60.00% foo
+ |
+ --- foo
+ bar
+ main
+ __libc_start_main
+
+ 40.00% bar
+ |
+ --- bar
+ main
+ __libc_start_main
+----------------------------------
+
+When the --children option is enabled, the 'self' overhead values of
+child functions (i.e. 'foo' and 'bar') are added to the parents to
+calculate the 'children' overhead. In this case the report could be
+displayed as:
+
+-------------------------------------------
+Children Self Symbol
+........ ........ ....................
+ 100.00% 0.00% __libc_start_main
+ |
+ --- __libc_start_main
+
+ 100.00% 0.00% main
+ |
+ --- main
+ __libc_start_main
+
+ 100.00% 40.00% bar
+ |
+ --- bar
+ main
+ __libc_start_main
+
+ 60.00% 60.00% foo
+ |
+ --- foo
+ bar
+ main
+ __libc_start_main
+-------------------------------------------
+
+In the above output, the 'self' overhead of 'foo' (60%) was add to the
+'children' overhead of 'bar', 'main' and '\_\_libc_start_main'.
+Likewise, the 'self' overhead of 'bar' (40%) was added to the
+'children' overhead of 'main' and '\_\_libc_start_main'.
+
+So '\_\_libc_start_main' and 'main' are shown first since they have
+same (100%) 'children' overhead (even though they have zero 'self'
+overhead) and they are the parents of 'foo' and 'bar'.
+
+Since v3.16 the 'children' overhead is shown by default and the output
+is sorted by its values. The 'children' overhead is disabled by
+specifying --no-children option on the command line or by adding
+'report.children = false' or 'top.children = false' in the perf config
+file.
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 4879cf6..896672b 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -193,6 +193,7 @@ OPTIONS
Accumulate callchain of children to parent entry so that then can
show up in the output. The output will have a new "Children" column
and will be sorted on the data. It requires callchains are recorded.
+ See the `overhead calculation' section for more details.

--max-stack::
Set the stack depth limit when parsing the callchain, anything
@@ -323,6 +324,9 @@ OPTIONS
--header-only::
Show only perf.data header (forces --stdio).

+
+include::callchain-overhead-calculation.txt[]
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-annotate[1]
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 3265b10..9e5b07eb 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -168,7 +168,7 @@ Default is to monitor all CPUS.
Accumulate callchain of children to parent entry so that then can
show up in the output. The output will have a new "Children" column
and will be sorted on the data. It requires -g/--call-graph option
- enabled.
+ enabled. See the `overhead calculation' section for more details.

--max-stack::
Set the stack depth limit when parsing the callchain, anything
@@ -234,6 +234,7 @@ INTERACTIVE PROMPTING KEYS

Pressing any unmapped key displays a menu, and prompts for input.

+include::callchain-overhead-calculation.txt[]

SEE ALSO
--------
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/