Re: Perf event to counter mapping question

From: Peter Zijlstra
Date: Thu Feb 23 2023 - 03:27:20 EST


On Wed, Feb 22, 2023 at 04:28:36PM -0800, Atish Patra wrote:

> AFAIK, ARM64 allows all-to-all mapping in pmuv3[1]. That makes life
> much easier. It just needs to pick the next available counter.
> On the other hand, x86 allows selective counter mapping which is
> discovered from the json file and maintained in per event
> constraints[4].

All the constraint management is done in the kernel, and yes, it's a
giant pain in the rear side.

From what I understand the reason for these constraints is complexity of
implementation: fewer constraints means more 'wires' in the hardware.

With PMU use being ever more popular, we're seeing the x86 PMU move
towards fewer constraints -- although I don't think we'll ever get rid
of them :/

> 2. Mandate all-to-all mapping similar to ARM64.

If at all possible, I would strongly recommend taking this route. Yes,
the hardware people will complain, but newer x86 hardware having fewer,
or simpler, constraints might be sufficient to convince them.

(and if you do have to do constraints, please take a lesson from x86 and
*never* allow overlapping constraints as AMD had; solving those
constraints is not fun)

As you note, this is *much* simpler to program and virtualize.
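
To make the difference concrete, here is a toy sketch (not the kernel's
actual scheduler; NUM_COUNTERS, assign_greedy() and assign_backtrack()
are made-up names for illustration). Each event carries a bitmask of the
counters it may use. With all-to-all mapping every mask is ALL_COUNTERS
and "pick the next free counter" always works as long as there are
enough counters. With overlapping masks the same greedy pass can fail
even though a valid assignment exists, so you end up needing some form
of search:

```c
#include <stdint.h>

#define NUM_COUNTERS 4
#define ALL_COUNTERS ((1u << NUM_COUNTERS) - 1)

/*
 * Greedy pass: give each event the lowest free counter its mask allows.
 * Returns 1 on success, 0 if some event could not be placed.
 */
static int assign_greedy(const uint32_t *masks, int n, int *out)
{
	uint32_t used = 0;

	for (int e = 0; e < n; e++) {
		int found = -1;

		for (int i = 0; i < NUM_COUNTERS; i++) {
			uint32_t bit = 1u << i;

			if ((masks[e] & bit) && !(used & bit)) {
				found = i;
				used |= bit;
				break;
			}
		}
		if (found < 0)
			return 0;
		out[e] = found;
	}
	return 1;
}

/*
 * Backtracking search: try every allowed free counter for event idx and
 * undo the choice if the remaining events cannot be placed.
 * Returns 1 on success, 0 if no assignment exists.
 */
static int assign_backtrack(const uint32_t *masks, int n, int idx,
			    uint32_t used, int *out)
{
	if (idx == n)
		return 1;

	for (int i = 0; i < NUM_COUNTERS; i++) {
		uint32_t bit = 1u << i;

		if ((masks[idx] & bit) && !(used & bit)) {
			out[idx] = i;
			if (assign_backtrack(masks, n, idx + 1, used | bit, out))
				return 1;
		}
	}
	return 0;
}
```

For example, with masks { 0x3, 0x1 } (first event may use counter 0 or
1, second event only counter 0) the greedy pass puts the first event on
counter 0 and then has nowhere to put the second, while the backtracking
version finds counters 1 and 0. With every mask set to ALL_COUNTERS the
greedy pass is all you need.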

> Note: This is only for programmable counters. If the platform supports
> any fixed counters (i.e. can monitor
> only a specific event), that needs to be provisioned via some other
> method. IIRC the fixed counters(apart from cycle) in ARM64 are part of
> AMU not PMU.

So free running counters are ideal and fairly simple to multiplex/use.
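
As a rough illustration of why they are simple to share (a sketch only;
counter_delta() and the width parameter are hypothetical, and it assumes
the counter wraps at most once between two reads): nobody ever has to
stop, reset or reprogram the counter, each user just keeps its own
snapshot and computes a delta modulo the counter width.

```c
#include <stdint.h>

/*
 * Events counted between two reads of a free-running counter that is
 * 'width' bits wide. Assumes at most one wrap between prev and now.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t now, unsigned int width)
{
	uint64_t mask = width >= 64 ? ~0ULL : (1ULL << width) - 1;

	/* Unsigned subtraction plus masking handles the wrap case. */
	return (now - prev) & mask;
}
```

Any number of users can do this against the same counter without
disturbing each other, since the hardware state is only ever read.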

The moment you start adding overflow interrupts / filters and any other
complexities to fixed function counters it becomes a mess (look at the
x86 PMU again).