[RFC 00/10] perf: Add cputime events/metrics

From: Jiri Olsa
Date: Wed Jun 06 2018 - 18:15:33 EST


hi,
so.. I could not get the exclude_idle bit to work reliably for the
cpu-clock event using the idle process's sum_exec_runtime, as
Peter outlined in his patch [1]. The time jumped up and down
and I couldn't make it stable.

But I noticed we actually have IDLE stats (and many more)
available for each CPU (enum cpu_usage_stat); we just can't
reach them from perf yet.
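
For reference, the same kind of per-CPU counters are already visible to
user space through /proc/stat. Below is a minimal Python sketch, assuming
the column order documented in proc(5) (user, nice, system, idle, iowait,
irq, softirq, steal, guest, guest_nice). The sample line is made up and
parse_cpu_line is a hypothetical helper for illustration, not something
from the patchset:

```python
# Column order of a "cpuN" line in /proc/stat, as documented in proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")

def parse_cpu_line(line):
    """Return (cpu_name, {field: ticks}) for one cpu line of /proc/stat."""
    name, *values = line.split()
    if not name.startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    # Values are in USER_HZ ticks; older kernels may print fewer columns,
    # in which case zip() simply stops at the last available one.
    return name, dict(zip(FIELDS, map(int, values)))

# Made-up sample line, for illustration only.
sample = "cpu0 4705 150 1120 16250 520 30 45 0 0 0"
name, stats = parse_cpu_line(sample)
print(name, stats["idle"], stats["iowait"])
```

On a live system the input would come from reading /proc/stat and
keeping the lines that start with "cpu".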

So this patchset adds a 'cputime' perf software PMU that provides
the CPUTIME_* stats via events that mirror their names:

# perf list | grep cputime
cputime/guest/ [Kernel PMU event]
cputime/guest_nice/ [Kernel PMU event]
cputime/idle/ [Kernel PMU event]
cputime/iowait/ [Kernel PMU event]
cputime/irq/ [Kernel PMU event]
cputime/nice/ [Kernel PMU event]
cputime/softirq/ [Kernel PMU event]
cputime/steal/ [Kernel PMU event]
cputime/system/ [Kernel PMU event]
cputime/user/ [Kernel PMU event]

I had some issues with the IDLE counter being miscounted when the
idle tick gets stopped. I tried to solve it in this patch (it's part
of the patchset):
perf/cputime: Don't stop idle tick if there's live cputime event

but I'm pretty sure it's wrong and there's a better solution.


However, most of the counts look ok so far, and here are a few
of my favorite commands I've been playing with:

# perf stat --top -I 1000
#           time    Idle  System    User     Irq  Softirq  IO wait
     1.001692690  100.0%    0.0%    0.0%    0.7%     0.2%     0.0%
     2.002994039   98.9%    0.0%    0.0%    0.9%     0.2%     0.0%
     3.004164038   98.5%    0.2%    0.2%    0.9%     0.2%     0.0%
     4.005312773   98.9%    0.0%    0.0%    0.9%     0.2%     0.0%


# perf stat --top-full -I 1000
#           time    Idle  System    User     Irq  Softirq  IO wait   Guest  Guest nice    Nice   Steal
     1.001750803  100.0%    0.0%    0.0%    0.7%     0.2%     0.0%    0.0%        0.0%    0.0%    0.0%
     2.003159490   99.0%    0.0%    0.0%    0.9%     0.2%     0.0%    0.0%        0.0%    0.0%    0.0%
     3.004358366   99.0%    0.0%    0.0%    0.9%     0.2%     0.0%    0.0%        0.0%    0.0%    0.0%
     4.005592436   98.9%    0.0%    0.0%    0.9%     0.2%     0.0%    0.0%        0.0%    0.0%    0.0%
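
To illustrate what an interval row expresses, here is a rough Python
sketch that turns two snapshots of cumulative counters into
per-interval percentages. It is not how the patchset computes the
columns: the sketch normalizes by the sum of the deltas, which is my
assumption, while the first row above adds up to more than 100%, so the
tool evidently uses a different denominator. The snapshot values and
the interval_percentages helper are made up for illustration:

```python
def interval_percentages(prev, curr):
    """Percent of the interval spent in each state, from two snapshots."""
    deltas = {k: curr[k] - prev[k] for k in curr}
    total = sum(deltas.values())
    if total == 0:
        # Nothing was accounted in this interval; avoid dividing by zero.
        return {k: 0.0 for k in deltas}
    return {k: 100.0 * d / total for k, d in deltas.items()}

# Made-up cumulative counters (arbitrary time units), one interval apart.
prev = {"idle": 1000, "system": 50, "user": 30,
        "irq": 5, "softirq": 2, "iowait": 0}
curr = {"idle": 1985, "system": 52, "user": 32,
        "irq": 14, "softirq": 4, "iowait": 0}

pcts = interval_percentages(prev, curr)
for name, pct in pcts.items():
    print("%-8s %5.1f%%" % (name, pct))
```

The counters only ever grow, so each row is the difference between two
consecutive reads rather than an absolute value.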


# perf stat -e cpu-clock,cputime/system/,cputime/user/,cputime/idle/ -a sleep 10

Performance counter stats for 'system wide':

     240070.828221      cpu-clock (msec)     #   23.999 CPUs utilized
   208,910,979,120 ns   cputime/system/      #    87.0% System
    20,589,603,359 ns   cputime/user/        #     8.6% User
     8,813,416,821 ns   cputime/idle/        #     3.7% Idle

10.003261054 seconds time elapsed
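
As a sanity check on the metric columns, the numbers above are
consistent with each cputime count being divided by the cpu-clock
total, and with "CPUs utilized" being cpu-clock over elapsed time. That
relation is my reading of the output, not something quoted from the
patches. A small Python sketch of the arithmetic, with the values
copied from the output above:

```python
# Values copied from the 'system wide' run above.
cpu_clock_ns = 240070.828221 * 1e6    # cpu-clock is reported in msec
elapsed_s = 10.003261054
cputime_ns = {
    "system": 208_910_979_120,
    "user":    20_589_603_359,
    "idle":     8_813_416_821,
}

def share(ns, total_ns):
    """Percentage of total_ns accounted for by ns."""
    return 100.0 * ns / total_ns

for name, ns in cputime_ns.items():
    print("%-6s %.1f%%" % (name, share(ns, cpu_clock_ns)))

cpus_utilized = cpu_clock_ns / 1e9 / elapsed_s
print("CPUs utilized: %.3f" % cpus_utilized)
```

This reproduces the 87.0%, 8.6%, 3.7% and 23.999 figures shown in the
output.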


# perf stat -e cpu-clock,cputime/system/,cputime/user/ yes > /dev/null
^Cyes: Interrupt

Performance counter stats for 'yes':

       3483.824364      cpu-clock (msec)     #    1.000 CPUs utilized
     2,460,117,205 ns   cputime/system/      #    70.6% System
     1,018,360,669 ns   cputime/user/        #    29.2% User

3.484554149 seconds time elapsed

1.018525000 seconds user
2.460515000 seconds sys

# perf stat --top -I 1000 --interval-clear
# perf stat --top -I 1000 --interval-clear --per-core
# perf stat --top -I 1000 --interval-clear --per-socket
# perf stat --top -I 1000 --interval-clear -A

The patchset is also available here:
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
perf/fixes

My current plan is now to read those counters in perf top/record/report
to show (at least) the idle percentage for the current profile.

thoughts? ;-)

thanks,
jirka


[1] https://marc.info/?l=linux-kernel&m=152397251027433&w=2
---
Jiri Olsa (10):
perf tools: Uniquify the event name if there's no other matched event
perf tools: Fix error index for pmu event parser
perf stat: Add --interval-clear option
perf stat: Use only color_fprintf call in print_metric_only
perf stat: Fix metric column display
perf stat: Allow to specify specific metric column len
perf stat: Add event parsing error handling to add_default_attributes
perf/cputime: Add cputime pmu
perf/cputime: Don't stop idle tick if there's live cputime event
perf stat: Add cputime metric support

include/linux/perf_event.h | 3 ++
kernel/events/Makefile | 2 +-
kernel/events/core.c | 1 +
kernel/events/cputime.c | 211 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
kernel/time/tick-sched.c | 4 +++
tools/perf/Documentation/perf-stat.txt | 68 ++++++++++++++++++++++++++++++++++++
tools/perf/builtin-stat.c | 104 ++++++++++++++++++++++++++++++++++++++++++++-----------
tools/perf/util/parse-events.y | 5 +++
tools/perf/util/stat-shadow.c | 70 +++++++++++++++++++++++++++++++++++++
tools/perf/util/stat.c | 10 ++++++
tools/perf/util/stat.h | 10 ++++++
11 files changed, 467 insertions(+), 21 deletions(-)
create mode 100644 kernel/events/cputime.c