Re: [RFC][PATCH 3/9] perf: export registered pmus via sysfs

From: Ingo Molnar
Date: Mon May 10 2010 - 07:49:10 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Mon, 2010-05-10 at 13:27 +0200, Peter Zijlstra wrote:
> > On Mon, 2010-05-10 at 18:26 +0800, Lin Ming wrote:
> >
> > > > No, I'm assuming there is only 1 PMU per CPU. Corey is the expert on
> > > > crazy hardware though, but I think the sanest way is to extend the CPU
> > > > topology if there's more structure to it.
> > >
> > > But our goal is to support multiple pmus, don't we need to assume there
> > > are more than 1 PMU per CPU?
> >
> > No, because as I said, then it's ambiguous which pmu you want. If you have
> > that, you need to extend your topology information.
> >
> > Anyway, I talked with Ingo on this and he'd like to see this somewhat
> > extended.
> >
> > Instead of a pmu_id field, which we pass into a new
> > perf_event_attr::pmu_id field, how about creating an event_source sysfs
> > class. Then each class can have an event_source_id and a hierarchy of
> > 'generic' events.
> >
> > We'd start using the PERF_TYPE_ space for this and express the
> > PERF_COUNT_ space in the event attributes found inside that class.
> >
> > That way we can include all the existing event enumerations into this as
> > well.
> >
> > This way we can create:
> >
> > /sys/devices/system/cpu/cpuN/cpu_hardware_events
> >     cpu_hardware_events/event_source_id
> >     cpu_hardware_events/cpu_cycles
> >     cpu_hardware_events/instructions
> >     cpu_hardware_events/...
> >
> > /sys/devices/system/cpu/cpuN/cpu_raw_events
> >     cpu_raw_events/event_source_id
> >
> >
> > These would match the current PERF_TYPE_* values for compatibility.
> >
> > For new PMUs we can start a dynamic range of PERF_TYPE_ (say at 64k but
> > that's not ABI and can be changed at any time, we've got u32 to play
> > with).
> >
> > For uncore this would result in:
> >
> > /sys/devices/system/node/nodeN/node_raw_events
> >     node_raw_events/event_source_id
> >
> > and maybe:
> >
> > /sys/devices/system/node/nodeN/node_events
> >     node_events/event_source_id
> >     node_events/local_misses
> >     node_events/local_hits
> >     node_events/remote_misses
> >     node_events/remote_hits
> >     node_events/...
> >
> >
> > The software events and tracepoints and kprobes stuff we could hang off
> > of /sys/kernel/ or something
>
> The GPU folks would hang it off of the drm class or maybe next to it in
> the PCI space.

It could conceivably be in multiple places as well - it can make sense to
enumerate a given event in more than one place.

( For example an 'interrupt' might show up in a given GPU - but it can also
show up amongst the IRQ tracepoints - or something like that. )

But by far the most common case would be for an event source to be attached to
one particular place in the sysfs topology.

Note how naturally this scheme extends to all things hardware topology - which
is already enumerated in sysfs. It also extends to all things software events
in a pretty natural way via /sys/kernel/mm/.

Plus we want to move the /sys/kernel/debug/ hacks for kprobes and
tracepoints out into this space as well. (Possibly do the same with
hw-breakpoints, by attaching them to the CPU directory - for completeness.)

That way /sys/class/event_source/ would provide an enumeration of all events
to 'perf list' and would automatically be usable by all the perf tooling.
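
To make the enumeration idea concrete, here is a rough userspace sketch of
what a 'perf list'-style walk could look like. It assumes the layout proposed
above (a directory is an event source if it contains an event_source_id file,
and every other file in it names an event). None of this exists yet, and the
function name and return shape are made up for illustration only.

```python
import os


def enumerate_event_sources(root):
    """Return {relative_dir: (event_source_id, [event names])} under root.

    Assumption (proposal only, not an existing ABI): a directory is an
    event source iff it contains a file named 'event_source_id' holding
    a decimal integer; all other files in that directory are events.
    """
    sources = {}
    # os.walk does not follow symlinks by default; this sketch only
    # handles a plain directory tree.
    for dirpath, _dirnames, filenames in os.walk(root):
        if "event_source_id" not in filenames:
            continue  # ordinary topology directory, not an event source
        with open(os.path.join(dirpath, "event_source_id")) as f:
            source_id = int(f.read().strip())
        events = sorted(n for n in filenames if n != "event_source_id")
        sources[os.path.relpath(dirpath, root)] = (source_id, events)
    return sources
```

A tool would pass the returned id as the event type and resolve the event
name through the files found next to it; how exactly those files encode the
event is left open here, just as it is in the proposal.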

Thanks,

Ingo