Re: [PATCH] perf stat: Hide invalid uncore event output for aggr mode

From: Ian Rogers
Date: Wed Jan 25 2023 - 12:33:53 EST


On Wed, Jan 25, 2023 at 12:12 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> The current display code for perf stat iterates given cpus and build the
> aggr map to collect the event data for the aggregation mode.
>
> But uncore events have their own cpu maps and it won't guarantee that
> it'd match to the aggr map. For example, per-package uncore events
> would generate a single value for each socket. When user asks per-core
> aggregation mode, the output would contain 0 values for other cores.
>
> Thus it needs to check the uncore PMU's cpumask and if it matches to the
> current aggregation id.
>
> Before:
> $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1
>
> Performance counter stats for 'system wide':
>
> S0-D0-C0 1 3.73 Joules power/energy-pkg/
> S0-D0-C1 0 <not counted> Joules power/energy-pkg/
> S0-D0-C2 0 <not counted> Joules power/energy-pkg/
> S0-D0-C3 0 <not counted> Joules power/energy-pkg/
>
> 1.001404046 seconds time elapsed
>
> Some events weren't counted. Try disabling the NMI watchdog:
> echo 0 > /proc/sys/kernel/nmi_watchdog
> perf stat ...
> echo 1 > /proc/sys/kernel/nmi_watchdog
>
> The core 1, 2 and 3 should not be printed because the event is handled
> in a cpu in the core 0 only. With this change, the output becomes like
> below.
>
> After:
> $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1
>
> Performance counter stats for 'system wide':
>
> S0-D0-C0 1 2.09 Joules power/energy-pkg/
>
> Fixes: b89761351089 ("perf stat: Update event skip condition for system-wide per-thread mode and merged uncore and hybrid events")

Thanks! Tested further with mixed core and uncore events like:
$ perf stat -A -a -e power/energy-pkg/,cycles sleep 1
and the "<not counted>" are now nicely gone.

> Cc: Athira Jajeev <atrajeev@xxxxxxxxxxxxxxxxxx>
> Cc: Michael Petlan <mpetlan@xxxxxxxxxx>
> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> ---
> tools/perf/util/stat-display.c | 37 ++++++++++++++++++++++++++++------
> 1 file changed, 31 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 8bd8b0142630..b9dcb13650d0 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -787,6 +787,22 @@ static void uniquify_counter(struct perf_stat_config *config, struct evsel *coun
> uniquify_event_name(counter);
> }
>
> +static bool check_uncore_event_aggr(struct perf_stat_config *config,
> + struct evsel *counter,
> + struct aggr_cpu_id *id)

nit: const *s
nit: check_uncore_event_aggr isn't particularly intention revealing,
perhaps something more like should_print_counter. Perhaps some
kernel-doc like:
/**
* should_print_counter() - Based on id, should the current counter's
value be printed.
* @config: The perf stat configuration with knowledge of the aggregation mode.
* @counter: The counter with its associated cpumap.
* @id: The aggregation type that is being queried for printing.
*
* An evlist can have evsels with different cpumaps, for example, by
mixing core and uncore events.
* When displaying one counter the other counter shouldn't be printed.
Check the counter's cpumap
* to see whether for any CPU it is associated with the counter should
be printed.
*
* Return: true for counters that should be printed.
*/

> +{
> + struct perf_cpu cpu;
> + int idx;
> +
> + perf_cpu_map__for_each_cpu(cpu, idx, counter->core.own_cpus) {

I know this is pre-existing, sorry to whine. I think we need to
document cpus vs own_cpus in the perf_evsel. Normally own_cpus ==
cpus, but in a case like this the difference matters and I have a hard
time understanding what "own" is supposed to be conveying, and why
here we don't use cpus. I also lose what the connection is with
perf_evlist all_cpus, does that union own_cpus or cpus? At least there
is a comment there :-D Honestly, why do we need to even have 2 cpu
maps in an evsel?

> + struct aggr_cpu_id own_id = config->aggr_get_id(config, cpu);
> +
> + if (aggr_cpu_id__equal(id, &own_id))
> + return true;
> + }
> + return false;
> +}
> +
> static void print_counter_aggrdata(struct perf_stat_config *config,
> struct evsel *counter, int s,
> struct outstate *os)
> @@ -814,12 +830,21 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
> ena = aggr->counts.ena;
> run = aggr->counts.run;
>
> - /*
> - * Skip value 0 when enabling --per-thread globally, otherwise it will
> - * have too many 0 output.
> - */
> - if (val == 0 && config->aggr_mode == AGGR_THREAD && config->system_wide)
> - return;
> + if (val == 0) {
> + /*
> + * Skip value 0 when enabling --per-thread globally,
> + * otherwise it will have too many 0 output.
> + */
> + if (config->aggr_mode == AGGR_THREAD && config->system_wide)
> + return;
> + /*
> + * Skip value 0 when it's an uncore event and the given aggr id
> + * does not belong to the PMU cpumask.
> + */
> + if (counter->core.requires_cpu &&
> + !check_uncore_event_aggr(config, counter, &id))
> + return;
> + }
>
> if (!metric_only) {
> if (config->json_output)
> --
> 2.39.1.405.gd4c25cc71f-goog
>