[PATCH 0/2] perf stat: add per-core count aggregation

From: Stephane Eranian
Date: Tue Feb 12 2013 - 09:10:36 EST


This patch series contains improvements to the aggregation support
in perf stat.

First, the aggregation code is refactored and an aggr_mode enum
is defined. There is also an important bug fix for the existing
per-socket aggregation.

Second, the series adds a new --aggr-core option to perf stat.
It aggregates counts per physical core, which is useful on
systems with hyper-threading. Cores are presented per socket:
S0-C1 means socket 0, core 1. Note that the core number is the
physical core id, so numbers may not always be contiguous. All of
this is based on the topology information available in sysfs.
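
To illustrate where the S<socket>-C<core> labels can come from, here is
a minimal sketch that reads the two standard sysfs topology files for a
CPU and formats the label. This is not the code from this series: the
helper names (read_topology_int, format_core_label, cpu_core_label) are
hypothetical and chosen only for the example.

```c
#include <stdio.h>

/* Read one integer from a per-CPU sysfs topology file; -1 on failure. */
int read_topology_int(int cpu, const char *name)
{
	char path[128];
	int value = -1;
	FILE *fp;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%d/topology/%s", cpu, name);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	if (fscanf(fp, "%d", &value) != 1)
		value = -1;
	fclose(fp);
	return value;
}

/*
 * Format a label such as "S0-C1" (socket 0, core 1).
 * Returns 0 on success, -1 if an id is negative or buf is too small.
 */
int format_core_label(char *buf, size_t len, int socket, int core)
{
	int n;

	if (socket < 0 || core < 0)
		return -1;
	n = snprintf(buf, len, "S%d-C%d", socket, core);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: build the label for one logical CPU. */
int cpu_core_label(int cpu, char *buf, size_t len)
{
	int socket = read_topology_int(cpu, "physical_package_id");
	int core = read_topology_int(cpu, "core_id");

	return format_core_label(buf, len, socket, core);
}
```

Because core_id is taken as reported by the kernel, two hyper-threads
of the same core get the same label, and the ids are not renumbered to
be contiguous.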

Per-core aggregation can be combined with interval printing:

 # perf stat -a --aggr-core -I 1000 -e cycles sleep 100
 #        time  core   cpus         counts  events
   1.000101160  S0-C0     2  6,051,254,899  cycles
   1.000101160  S0-C1     2  6,379,230,776  cycles
   1.000101160  S0-C2     2  6,480,268,471  cycles
   1.000101160  S0-C3     2  6,110,514,321  cycles
   2.000663750  S0-C0     2  6,572,533,016  cycles
   2.000663750  S0-C1     2  6,378,623,674  cycles
   2.000663750  S0-C2     2  6,264,127,589  cycles
   2.000663750  S0-C3     2  6,305,346,613  cycles

For instance, on this SNB machine we can see that the load is
evenly balanced across all 4 physical cores (HT is on).
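
The cpus and counts columns can be read as a per-core sum over the
hardware threads that share a core. The sketch below shows that idea
under stated assumptions: the struct names, MAX_CORES and
aggregate_per_core() are hypothetical and are not taken from
builtin-stat.c. A linear search is used purely for brevity.

```c
#include <stdint.h>

#define MAX_CORES 64	/* illustrative capacity for the output array */

/* One per-CPU reading, tagged with its topology ids. */
struct cpu_sample {
	int socket;
	int core;
	uint64_t count;
};

/* Aggregated result for one (socket, core) pair. */
struct core_total {
	int socket;
	int core;
	int nr_cpus;	/* number of logical CPUs folded into count */
	uint64_t count;
};

/*
 * Sum per-CPU counts into per-core buckets, in order of first appearance.
 * Returns the number of distinct cores, or -1 if out[] is too small.
 */
int aggregate_per_core(const struct cpu_sample *in, int nr_in,
		       struct core_total *out, int max_out)
{
	int nr_out = 0;
	int i, j;

	for (i = 0; i < nr_in; i++) {
		/* Look for an existing bucket with the same socket and core. */
		for (j = 0; j < nr_out; j++) {
			if (out[j].socket == in[i].socket &&
			    out[j].core == in[i].core)
				break;
		}
		if (j == nr_out) {
			if (nr_out == max_out)
				return -1;
			out[j].socket = in[i].socket;
			out[j].core = in[i].core;
			out[j].nr_cpus = 0;
			out[j].count = 0;
			nr_out++;
		}
		out[j].nr_cpus++;
		out[j].count += in[i].count;
	}
	return nr_out;
}
```

With HT enabled, each bucket ends up with nr_cpus == 2, which matches
the cpus column in the output above.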

Signed-off-by: Stephane Eranian <eranian@xxxxxxxxxx>

-
Stephane Eranian (2):
  perf stat: refactor aggregation code
  perf stat: add per-core aggregation

 tools/perf/Documentation/perf-stat.txt |    6 +
 tools/perf/builtin-stat.c              |  237 ++++++++++++++++++++------------
 tools/perf/util/cpumap.c               |   86 ++++++++++--
 tools/perf/util/cpumap.h               |   12 ++
 4 files changed, 239 insertions(+), 102 deletions(-)

--
1.7.9.5
