Re: [PATCH v2] perf/arm: adjust hwevents mappings on boot

From: Stephane Eranian
Date: Fri May 06 2022 - 18:23:59 EST


On Fri, May 6, 2022 at 6:09 AM Will Deacon <will@xxxxxxxxxx> wrote:
>
> On Thu, Mar 24, 2022 at 11:14:58AM -0700, Stephane Eranian wrote:
> > The mapping of perf_events generic hardware events to actual PMU events on
> > ARM PMUv3 may not always be correct. This is in particular true for the
> > PERF_COUNT_HW_BRANCH_INSTRUCTIONS event. Although the mapping points to an
> > architected event, it may not always be available. This can be seen with a
> > simple:
> >
> > $ perf stat -e branches sleep 0
> > Performance counter stats for 'sleep 0':
> >
> > <not supported> branches
> >
> > 0.001401081 seconds time elapsed
> >
> > Yet the hardware does have an event that could be used for branches.
> > This patch fixes the problem by dynamically validating the generic hardware
> > events against the supported architected events. If a mapping is wrong it
> > can be replaced with another. This is done for the event above at boot time
> > and the kernel will log the remapping:
> >
> > armv8_pmuv3: hwevent HW_BRANCH_INSTRUCTIONS remapped from 0xc to 0x21
> >
> > And with that:
> >
> > $ perf stat -e branches sleep 0
> >
> > Performance counter stats for 'sleep 0':
> >
> > 166,739 branches
> >
> > 0.000832163 seconds time elapsed
> >
> > Signed-off-by: Stephane Eranian <eranian@xxxxxxxxxx>
> > ---
> > arch/arm64/kernel/perf_event.c | 41 +++++++++++++++++++++++++++++++++-
> > 1 file changed, 40 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index cab678ed6618..d438f5a46bdc 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -39,7 +39,7 @@
> > * be supported on any given implementation. Unsupported events will
> > * be disabled at run-time based on the PMCEID registers.
> > */
> > -static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
> > +static unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
> > PERF_MAP_ALL_UNSUPPORTED,
> > [PERF_COUNT_HW_CPU_CYCLES] = ARMV8_PMUV3_PERFCTR_CPU_CYCLES,
> > [PERF_COUNT_HW_INSTRUCTIONS] = ARMV8_PMUV3_PERFCTR_INST_RETIRED,
> > @@ -1222,6 +1222,42 @@ static void armv8_pmu_register_sysctl_table(void)
> > register_sysctl("kernel", armv8_pmu_sysctl_table);
> > }
> >
> > +static void armv8pmu_fixup_perf_map(struct arm_pmu *cpu_pmu)
> > +{
> > +	int i, code;
> > +	unsigned *map = armv8_pmuv3_perf_map;
> > +
> > +	for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
> > +retry:
> > +		code = map[i];
> > +		if (code == HW_OP_UNSUPPORTED)
> > +			continue;
> > +
> > +		if (test_bit(map[i], cpu_pmu->pmceid_bitmap))
> > +			continue;
> > +		/*
> > +		 * mapping does not exist,
> > +		 * let's see if we can fix it
> > +		 */
> > +		switch (i) {
> > +		case PERF_COUNT_HW_BRANCH_INSTRUCTIONS:
> > +			if (code == ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED) {
> > +				map[i] = ARMV8_PMUV3_PERFCTR_BR_RETIRED;
> > +				pr_info("armv8_pmuv3: hwevent "
> > +					"HW_BRANCH_INSTRUCTIONS remapped "
> > +					" from 0x%x to 0x%x\n", code, map[i]);
> > +				goto retry;
> > +			}
> > +			break;
> > +		default:
> > +			pr_info("armv8_pmuv3: hwevent %d not supported\n", i);
>
> If a CPU supports neither ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED nor
> ARMV8_PMUV3_PERFCTR_BR_RETIRED, won't we get a funny series of messages
> here? I think I'd prefer to drop the prints altogether.
>
Ok, let me clean this up.
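
For discussion, a rough user-space sketch of what a print-free fixup could
look like. This is not the next version of the patch, only an illustration
under stated assumptions: the event codes 0xc and 0x21 come from the remap
log quoted above, while HW_OP_UNSUPPORTED, NR_EVENTS, the small enum, and the
helpers event_supported() and fallback_for() are hypothetical stand-ins for
the kernel definitions and for test_bit() on pmceid_bitmap. A 64-bit word
replaces the real bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values; the two event codes match the remap log (0xc -> 0x21). */
#define HW_OP_UNSUPPORTED	0xFFFFu
#define EV_PC_WRITE_RETIRED	0x0Cu
#define EV_BR_RETIRED		0x21u
#define NR_EVENTS		64u	/* hypothetical: one 64-bit word of events */

/* Hypothetical subset of the generic hardware event indices. */
enum { HW_CPU_CYCLES, HW_INSTRUCTIONS, HW_BRANCH_INSTRUCTIONS, HW_MAX };

/* Hypothetical helper standing in for test_bit() on pmceid_bitmap. */
static bool event_supported(uint64_t pmceid, unsigned int code)
{
	assert(code < NR_EVENTS);
	return (pmceid >> code) & 1u;
}

/* Known fallback for a generic event, or HW_OP_UNSUPPORTED if there is none. */
static unsigned int fallback_for(int idx, unsigned int code)
{
	if (idx == HW_BRANCH_INSTRUCTIONS && code == EV_PC_WRITE_RETIRED)
		return EV_BR_RETIRED;
	return HW_OP_UNSUPPORTED;
}

/*
 * Remap an entry only when its fallback is actually implemented.
 * No messages are logged and no retry loop is needed, so a CPU that
 * supports neither event simply keeps its original mapping.
 */
static void fixup_perf_map(unsigned int *map, uint64_t pmceid)
{
	for (int i = 0; i < HW_MAX; i++) {
		unsigned int code = map[i];
		unsigned int alt;

		if (code == HW_OP_UNSUPPORTED || event_supported(pmceid, code))
			continue;

		alt = fallback_for(i, code);
		if (alt != HW_OP_UNSUPPORTED && event_supported(pmceid, alt))
			map[i] = alt;
	}
}
```

Checking the fallback before writing to the map is a design choice made for
this sketch: it avoids the intermediate state where the entry points at a
second event that is also missing.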

> Will