Re: [PATCH 2/2] perf list: do not print metrics on s390 zvm systems

From: Ian Rogers
Date: Tue Mar 19 2024 - 12:37:33 EST


On Tue, Mar 19, 2024 at 2:56 AM Thomas Richter <tmricht@xxxxxxxxxxxxx> wrote:
>
> On s390 z/VM virtual machines command perf list also displays metrics:
>
> # ./perf list | grep -A 20 'Metric Groups:'
> Metric Groups:
>
> No_group:
> cpi
> [Cycles per Instruction]
> est_cpi
> [Estimated Instruction Complexity CPI infinite Level 1]
> finite_cpi
> [Cycles per Instructions from Finite cache/memory]
> l1mp
> [Level One Miss per 100 Instructions]
> l2p
> [Percentage sourced from Level 2 cache]
> l3p
> [Percentage sourced from Level 3 on same chip cache]
> l4lp
> [Percentage sourced from Level 4 Local cache on same book]
> l4rp
> [Percentage sourced from Level 4 Remote cache on different book]
> memp
> [Percentage sourced from memory]
> ....
> #
>
> This is not correct, on s390 z/VM virtual machines the referenced
> CPU Counter Measurement facility does not exist. The command
>
> # ./perf stat -M cpi -- true
> event syntax error: '{CPU_CYCLES/metric-id=CPU_CYCLES/.....'
> \___ Bad event or PMU
>
> Unable to find PMU or event on a PMU of 'CPU_CYCLES'
>
> event syntax error: '{CPU_CYCLES/metric-id=CPU_CYCLES/...'
> \___ Cannot find PMU `CPU_CYCLES'.
> Missing kernel support?
> #
>
> fails.
>
> Perf list should not display the metrics when the referenced
> CPU Counter Measurement PMU is not available.
>
> Output after:
> # ./perf list | grep -A 20 'Metric Groups:'
> #
>
> Fixes: 7f76b3113068 ("perf list: Add IBM z16 event description for s390")
> Signed-off-by: Thomas Richter <tmricht@xxxxxxxxxxxxx>
> Acked-by: Sumanth Korikkar <sumanthk@xxxxxxxxxxxxx>
> ---
> tools/perf/arch/s390/util/pmu.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/tools/perf/arch/s390/util/pmu.c b/tools/perf/arch/s390/util/pmu.c
> index 886c30e001fa..2dc27acc860c 100644
> --- a/tools/perf/arch/s390/util/pmu.c
> +++ b/tools/perf/arch/s390/util/pmu.c
> @@ -8,6 +8,7 @@
> #include <string.h>
>
> #include "../../../util/pmu.h"
> +#include "../../../util/pmus.h"
>
> #define S390_PMUPAI_CRYPTO "pai_crypto"
> #define S390_PMUPAI_EXT "pai_ext"
> @@ -20,3 +21,19 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
> !strcmp(pmu->name, S390_PMUCPUM_CF))
> pmu->selectable = true;
> }
> +
> +const struct pmu_metrics_table *pmu_metrics_table__find(void)
> +{
> + struct perf_pmu *pmu;
> +
> + /*
> + * Metrics defined on events from PMU cpum_cf aren't supported
> + * on z/VM. Make sure the PMU exists and return NULL if that
> + * PMU cannot be found.
> + */
> + pmu = perf_pmus__find("cpum_cf");
> + if (pmu)
> + return perf_pmu__find_metrics_table(pmu);
> +
> + return NULL;
> +}

Hi Thomas,

Thank you for this, it is good the patch addresses the issue but I'm
less keen on the approach. The approach is to disable all metrics when
the PMU cpum_cf isn't present.

Similar to the metric cpi there is a hard coded IPC metric:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/stat-shadow.c?h=perf-tools-next#n314
I'd like to delete this code :-) The hard coded metrics magically fire
when certain events are present. This causes issues with code that
processes the output as some metrics can accidentally fire unasked for
additional hard coded metrics. The hard coded metrics also don't use
groups for events, etc. I do like they have more color options
(thresholds) than the json metrics.

Using python to generate metrics achieves a lot of simplification and
hopefully lowers the bar for new metrics being written. Having metrics
disabled when a PMU isn't present means that the only way to get the
metric could be to hard code it - something I want to delete :-) There
are metrics like cycles (looking for review/unmerged):
https://lore.kernel.org/lkml/20240314055801.1973422-3-irogers@xxxxxxxxxx/
that don't use sysfs or json events and that could work on s390. I
think ideally we'd work to enable this.

(1) For the cpi metric here, presumably CPU_CYCLES is used to avoid
the legacy cpu-cycles event. Would using cpu-cycles fix the metric
(assuming legacy events work in a virtual machine) ? An aside, it is
kind of gross that the event names are so similar as event names
aren't case sensitive. As sysfs/json events now have priority over
legacy events, using the name cpu-cycles is a possibility (x86 uses
this name).

(2) Another option is that in the metric you can use the "has_event"
function. So:
"MetricExpr": "CPU_CYCLES / INSTRUCTIONS"
becomes something like :
"MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) if has_event(CPU_CYCLES) else 0"
This has been the existing approach taken.

(3) We could also make "perf list" not show metrics when the events
aren't present. We do something similar for legacy cache events:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n306
This could complicate testing as we iterate through all metrics
expecting them to work. An event name typo would hide the metric in
the list from and we'd miss the typo in testing.

So I think solution (2) is best, wdyt?

Thanks,
Ian


> --
> 2.44.0
>