Re: [tip:perf/diff] perf diff: Introduce tool to show performancedifference

From: Ingo Molnar
Date: Tue Dec 15 2009 - 04:52:55 EST



'perf diff' looks to be very useful!

I think we want to improve the default output some more:

For example there should be column names like for perf report:

# Overhead Command Shared Object Symbol
# ........ ......... ............................ ......
#
13.87% hackbench [kernel] [k] _raw_spin_lock
5.60% hackbench [kernel] [k] copy_user_generic_string
5.23% hackbench [kernel] [k] __cache_free

Secondly, here's the current output of two successive 'hackbench 10' run:

1 +2381 +58 [kernel.kallsyms] avc_has_perm_noaudit
2 -26 [kernel.kallsyms] clear_page_c
3 +107 +29 [kernel.kallsyms] _raw_spin_lock
4 +106 +23 [kernel.kallsyms] _raw_spin_lock
5 +105 +22 [kernel.kallsyms] _raw_spin_lock
6 +104 +21 [kernel.kallsyms] _raw_spin_lock
7 +103 +20 [kernel.kallsyms] _raw_spin_lock
8 +102 +20 [kernel.kallsyms] _raw_spin_lock
9 +101 +19 [kernel.kallsyms] _raw_spin_lock
10 -6 -28 [kernel.kallsyms] selinux_task_wait
11 +99 +18 [kernel.kallsyms] _raw_spin_lock
12 +98 +18 [kernel.kallsyms] _raw_spin_lock
13 +97 +18 [kernel.kallsyms] _raw_spin_lock
14 +96 +18 [kernel.kallsyms] _raw_spin_lock
15 +220 +20 [kernel.kallsyms] _raw_spin_lock_irqsave
16 +94 +17 [kernel.kallsyms] _raw_spin_lock
17 +93 +17 [kernel.kallsyms] _raw_spin_lock
18 +92 +16 [kernel.kallsyms] _raw_spin_lock
19 +9648 +30 [kernel.kallsyms] copy_page_c
20 +90 +15 [kernel.kallsyms] _raw_spin_lock
21 +89 +15 [kernel.kallsyms] _raw_spin_lock

beyond the missing column names, there's some other things visible too:

- the symbol names are per pid, so they repeat all over again. I think we
want the default output to be like perf report's, i.e. PIDs get summarized
over commands.

Furthermore, i think -p should be enabled by default. Especially with a lot of
functions the +/- notation isnt very obvious at first sight.

I.e. output like this would be more useful i think:

baseline delta command symbol
-----------------------------------------------------------
13.87% +0.11% hackbench [k] _raw_spin_lock
5.60% -2.53% hackbench [k] copy_user_generic_string
5.23% -0.14% hackbench [k] __cache_free
4.30% -2.59% hackbench [k] unix_stream_recvmsg
3.35% -1.56% hackbench [k] avc_has_perm_noaudit
-----------------------------------------------------------
Sum:
0.51% +- 0.05% less time elapsed

The relative position can be printed too but should not necessarily be the
default. Printing the delta in elapsed time is also very important, because
not only do we want to know the after/before profile delta, we also want to
see the tangible result in elapsed time - via the same tool.

Btw., i think we also want a "--repeat N" feature for perf record, to allow
the recording of the same command over and over again, so that a higher
quality profile can be achieved.

"perf record -f --repeat 10 cmd" would be a convenience shortcut for:

rm -f perf.data
for ((i=0; i<10; i++)); do
perf record -A cmd
done

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/