Re: [PATCH v2 5/9] libperf: Add support for user space counter access

From: Rob Herring
Date: Wed Sep 02 2020 - 15:48:34 EST


On Wed, Sep 2, 2020 at 12:07 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> On Fri, Aug 28, 2020 at 1:56 PM Rob Herring <robh@xxxxxxxxxx> wrote:
> >
> > x86 and arm64 can both support direct access of event counters in
> > userspace. The access sequence is less than trivial and currently exists
> > in perf test code (tools/perf/arch/x86/tests/rdpmc.c) with copies in
> > projects such as PAPI and libpfm4.
> >
> > In order to support usersapce access, an event must be mmapped. While
> > there's already mmap support for evlist, the usecase is a bit different
> > than the self monitoring with userspace access. So let's add a new
> > perf_evsel__mmap() function to mmap an evsel. This allows implementing
> > userspace access as a fastpath for perf_evsel__read().
> >
> > The mmapped address is returned by perf_evsel__mmap() primarily for
> > users/tests to check if userspace access is enabled.
> >
> > Signed-off-by: Rob Herring <robh@xxxxxxxxxx>
> > ---

> > +int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count)
> > +{
> > + struct perf_event_mmap_page *pc = map->base;
> > + u32 seq, idx, time_mult = 0, time_shift = 0;
> > + u64 cnt, cyc = 0, time_offset = 0, time_cycles = 0, time_mask = ~0ULL;
> > +
> > + BUG_ON(!pc);
> > +
> > + if (!pc->cap_user_rdpmc)
> > + return -1;
> > +
> > + do {
> > + seq = READ_ONCE(pc->lock);
> > + barrier();
> > +
> > + count->ena = READ_ONCE(pc->time_enabled);
> > + count->run = READ_ONCE(pc->time_running);
> > +
> > + if (pc->cap_user_time && count->ena != count->run) {
> > + cyc = read_timestamp();
> > + time_mult = READ_ONCE(pc->time_mult);
> > + time_shift = READ_ONCE(pc->time_shift);
> > + time_offset = READ_ONCE(pc->time_offset);
> > +
> > + if (pc->cap_user_time_short) {
> > + time_cycles = READ_ONCE(pc->time_cycles);
> > + time_mask = READ_ONCE(pc->time_mask);
> > + }
> > + }
> > +
> > + idx = READ_ONCE(pc->index);
> > + cnt = READ_ONCE(pc->offset);
> > + if (pc->cap_user_rdpmc && idx) {
> > + u64 evcnt = read_perf_counter(idx - 1);
> > + u16 width = READ_ONCE(pc->pmc_width);
> > +
> > + evcnt <<= 64 - width;
> > + evcnt >>= 64 - width;
> > + cnt += evcnt;
> > + } else
> > + return -1;
> > +
> > + barrier();
> > + } while (READ_ONCE(pc->lock) != seq);
> > +
> > + if (count->ena != count->run) {
>
> There's an existing bug here that I tried to resolve in this patch:
> https://lore.kernel.org/lkml/CAP-5=fVRdqvswtyQMg5cB+ntTGda+SAYskjTQednEH-AeZo13g@xxxxxxxxxxxxxx/
> Due to multiplexing, enabled may be > 0 but run == 0 and the divide
> below can end up with divide by zero.

Yeah, I saw that, but didn't try to also fix that issue here.

> I like the idea of this code being in a library, there's an intent
> that the perf_event.h and test code be copy-paste-able, but there is
> some pre-existing divergence. It would be nice if this code could be
> closer to the sample code in both the test and perf_event.h.

The only way we get and keep all the versions of the code aligned is
removing the other copies. We should just remove the code comment from
perf_event.h IMO. If rdpmc.c is going to stick around given some
resistance to removing it, then perhaps it should be converted to use
libperf. At that point it could also be arch independent. Though I
don't like the idea of having the same test twice.

> As per the change above, I think running and enabled times need to be
> out arguments.

They are now in this version.

Rob