Re: [PATCH v2] perf/arm: adjust hwevents mappings on boot

From: Will Deacon
Date: Fri May 06 2022 - 09:10:04 EST


On Thu, Mar 24, 2022 at 11:14:58AM -0700, Stephane Eranian wrote:
> The mapping of perf_events generic hardware events to actual PMU events on
> ARM PMUv3 may not always be correct. This is in particular true for the
> PERF_COUNT_HW_BRANCH_INSTRUCTIONS event. Although the mapping points to an
> architected event, it may not always be available. This can be seen with a
> simple:
>
> $ perf stat -e branches sleep 0
> Performance counter stats for 'sleep 0':
>
> <not supported> branches
>
> 0.001401081 seconds time elapsed
>
> Yet the hardware does have an event that could be used for branches.
> This patch fixes the problem by dynamically validating the generic hardware
> events against the supported architected events. If a mapping is wrong it
> can be replaced with another. This is done for the event above at boot time
> and the kernel will log the remapping:
>
> armv8_pmuv3: hwevent HW_BRANCH_INSTRUCTIONS remapped from 0xc to 0x21
>
> And with that:
>
> $ perf stat -e branches sleep 0
>
> Performance counter stats for 'sleep 0':
>
> 166,739 branches
>
> 0.000832163 seconds time elapsed
>
> Signed-off-by: Stephane Eranian <eranian@xxxxxxxxxx>
> ---
> arch/arm64/kernel/perf_event.c | 41 +++++++++++++++++++++++++++++++++-
> 1 file changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index cab678ed6618..d438f5a46bdc 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -39,7 +39,7 @@
> * be supported on any given implementation. Unsupported events will
> * be disabled at run-time based on the PMCEID registers.
> */
> -static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
> +static unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
> PERF_MAP_ALL_UNSUPPORTED,
> [PERF_COUNT_HW_CPU_CYCLES] = ARMV8_PMUV3_PERFCTR_CPU_CYCLES,
> [PERF_COUNT_HW_INSTRUCTIONS] = ARMV8_PMUV3_PERFCTR_INST_RETIRED,
> @@ -1222,6 +1222,42 @@ static void armv8_pmu_register_sysctl_table(void)
> register_sysctl("kernel", armv8_pmu_sysctl_table);
> }
>
> +static void armv8pmu_fixup_perf_map(struct arm_pmu *cpu_pmu)
> +{
> + int i, code;
> + unsigned *map = armv8_pmuv3_perf_map;
> +
> + for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
> +retry:
> + code = map[i];
> + if (code == HW_OP_UNSUPPORTED)
> + continue;
> +
> + if (test_bit(map[i], cpu_pmu->pmceid_bitmap))
> + continue;
> + /*
> + * mapping does not exist,
> + * let's see if we can fix it
> + */
> + switch (i) {
> + case PERF_COUNT_HW_BRANCH_INSTRUCTIONS:
> + if (code == ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED) {
> + map[i] = ARMV8_PMUV3_PERFCTR_BR_RETIRED;
> + pr_info("armv8_pmuv3: hwevent "
> + "HW_BRANCH_INSTRUCTIONS remapped "
> + " from 0x%x to 0x%x\n", code, map[i]);
> + goto retry;
> + }
> + break;
> + default:
> + pr_info("armv8_pmuv3: hwevent %d not supported\n", i);

If a CPU supports neither ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED nor
ARMV8_PMUV3_PERFCTR_BR_RETIRED, won't we get a funny series of messages
here? I think I'd prefer to drop the prints altogether.
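
To make the concern concrete, here is a rough user-space sketch of the retry
logic for the PERF_COUNT_HW_BRANCH_INSTRUCTIONS slot only. It is not kernel
code: struct fake_pmu, event_supported() and fixup_branch_map() are
hypothetical stand-ins for cpu_pmu->pmceid_bitmap, test_bit() and the loop in
the patch, and the event numbers are taken from the log line quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Event numbers as shown in the patch's log line (0xc -> 0x21). */
#define PC_WRITE_RETIRED 0x0c
#define BR_RETIRED       0x21

/* Hypothetical stand-in for the PMCEID bitmap of one CPU. */
struct fake_pmu {
	bool has_pc_write_retired;
	bool has_br_retired;
};

/* Hypothetical stand-in for test_bit(code, cpu_pmu->pmceid_bitmap). */
static bool event_supported(const struct fake_pmu *pmu, unsigned code)
{
	if (code == PC_WRITE_RETIRED)
		return pmu->has_pc_write_retired;
	if (code == BR_RETIRED)
		return pmu->has_br_retired;
	return false;
}

/*
 * Mirrors the patch's retry flow for the branch-instructions slot.
 * Stores the message that would be logged in 'log' (empty if none)
 * and returns the mapping left in the table afterwards.
 */
static unsigned fixup_branch_map(const struct fake_pmu *pmu,
				 char *log, size_t len)
{
	unsigned code = PC_WRITE_RETIRED;

	assert(len > 0);
	log[0] = '\0';

	for (;;) {
		if (event_supported(pmu, code))
			return code;
		/* Second pass: the patch hits 'break' here, silently. */
		if (code != PC_WRITE_RETIRED)
			return code;
		snprintf(log, len, "remapped from 0x%x to 0x%x",
			 code, BR_RETIRED);
		code = BR_RETIRED;
	}
}
```

With both flags false, the sketch records a "remapped" message and leaves the
slot pointing at BR_RETIRED, even though that event is not implemented either.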

Will