Re: [PATCH v3] perf/arm: adjust hwevents mappings on boot

From: Mark Rutland
Date: Tue Aug 16 2022 - 10:18:29 EST


On Tue, Aug 16, 2022 at 03:02:21PM +0200, Peter Newman wrote:
> From: Stephane Eranian <eranian@xxxxxxxxxx>
>
> The mapping of perf_events generic hardware events to actual PMU events on
> ARM PMUv3 may not always be correct. This is in particular true for the
> PERF_COUNT_HW_BRANCH_INSTRUCTIONS event. Although the mapping points to an
> architected event, it may not always be available. This can be seen with a
> simple:
>
> $ perf stat -e branches sleep 0
> Performance counter stats for 'sleep 0':
>
> <not supported> branches
>
> 0.001401081 seconds time elapsed
>
> Yet the hardware does have an event that could be used for branches. This
> patch fixes the problem by dynamically validating the generic hardware
> events against the supported architected events. If a mapping is wrong it
> can be replaced with another. This is done for the event above at boot
> time.
>
> And with that:
>
> $ perf stat -e branches sleep 0
>
> Performance counter stats for 'sleep 0':
>
> 166,739 branches
>
> 0.000832163 seconds time elapsed
>
> Signed-off-by: Stephane Eranian <eranian@xxxxxxxxxx>
> Co-developed-by: Peter Newman <peternewman@xxxxxxxxxx>
> Signed-off-by: Peter Newman <peternewman@xxxxxxxxxx>
> ---
>
> v2: https://lore.kernel.org/lkml/20220324181458.3216262-1-eranian@xxxxxxxxxx/
>
> since v2, removed prints per Will's suggestion
>
> arch/arm64/kernel/perf_event.c | 36 +++++++++++++++++++++++++++++++++-
> 1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index cb69ff1e6138..945c31e3f3e3 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -39,7 +39,7 @@
> * be supported on any given implementation. Unsupported events will
> * be disabled at run-time based on the PMCEID registers.
> */
> -static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
> +static unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
> PERF_MAP_ALL_UNSUPPORTED,
> [PERF_COUNT_HW_CPU_CYCLES] = ARMV8_PMUV3_PERFCTR_CPU_CYCLES,
> [PERF_COUNT_HW_INSTRUCTIONS] = ARMV8_PMUV3_PERFCTR_INST_RETIRED,

On big.LITTLE systems this array is shared by multiple PMUs, so it cannot
be altered based on a single PMU.

Rather than applying a fixup, could we special-case this at mapping time?
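
To illustrate the idea in isolation, here is a minimal user-space sketch of
special-casing at mapping time. It is not the kernel code: struct pmu,
map_event(), the HW_* and EVT_* names, the 64-bit "supported" bitmap, and the
choice of fallback event are all hypothetical stand-ins. The shared table stays
const, and the decision is made per PMU from that PMU's own list of supported
events.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the generic perf hardware event IDs. */
enum { HW_CYCLES, HW_INSTRUCTIONS, HW_BRANCHES, HW_MAX };

/* Hypothetical event encodings; the values are illustrative only. */
#define EVT_UNSUPPORTED      (-1)
#define EVT_INST_RETIRED     0x08
#define EVT_PC_WRITE_RETIRED 0x0c
#define EVT_CPU_CYCLES       0x11
#define EVT_BR_RETIRED       0x21

/* Shared default map: const, so it is safe to share between PMUs. */
static const int default_map[HW_MAX] = {
	[HW_CYCLES]       = EVT_CPU_CYCLES,
	[HW_INSTRUCTIONS] = EVT_INST_RETIRED,
	[HW_BRANCHES]     = EVT_PC_WRITE_RETIRED,
};

/* Per-PMU state: bit N is set if this PMU implements event N (N < 64). */
struct pmu {
	uint64_t supported;
};

static bool pmu_has_event(const struct pmu *pmu, int evt)
{
	return (pmu->supported >> evt) & 1;
}

/*
 * Map a generic event to a hardware event for one specific PMU.
 * The branch event is special-cased here instead of patching the
 * shared table, so each PMU gets its own answer.
 */
static int map_event(const struct pmu *pmu, int hw_id)
{
	int evt;

	if (hw_id < 0 || hw_id >= HW_MAX)
		return EVT_UNSUPPORTED;

	if (hw_id == HW_BRANCHES) {
		if (pmu_has_event(pmu, EVT_PC_WRITE_RETIRED))
			return EVT_PC_WRITE_RETIRED;
		if (pmu_has_event(pmu, EVT_BR_RETIRED))
			return EVT_BR_RETIRED;
		return EVT_UNSUPPORTED;
	}

	evt = default_map[hw_id];
	return pmu_has_event(pmu, evt) ? evt : EVT_UNSUPPORTED;
}
```

With this shape, two PMUs that implement different branch events can each get
a working mapping from map_event() without either one modifying state the
other depends on.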

Does the following work for you?

Thanks,
Mark.

---->8----