Re: [PATCH/RFC 2/2] perfcounters: add an mmap method to allow userspace to read hardware counters

From: Peter Zijlstra
Date: Tue Mar 17 2009 - 04:42:19 EST


On Tue, 2009-03-17 at 19:27 +1100, Paul Mackerras wrote:
> Peter Zijlstra writes:
>
> > While I think mmap'ed counters is a great idea, I really don't like this
> > patch. It adds a second output format unrelated to the regular output
> > format, and it doesn't appear to honor the regular output rules either.
> > PERF_RECORD_GROUP thingies won't work for example.
> >
> > Nor is there any kind of queuing; one might want to have multiple events
> > in the mmap buffer.
>
> I think you have misunderstood. This is not about sampling counters
> *at all*. This is about simple counting counters.

I think I did indeed.

> On powerpc, userspace can read the hardware counters directly. This
> stuff lets a program that is counting hardware events on itself do
> that and translate the result into a full 64-bit value. The
> information the program needs in order to do this is (a) which
> hardware counter (if any) has been assigned to this particular
> perf_counter and (b) what the offset between the hardware counter
> value and the full 64-bit perf_counter value is. That, plus a
> seqlock-style lock, is what's in the mmapped page.

Ah, right. I think some of the intel chips can do similar things with
rdpmc instructions.
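
To make the read side concrete, here is a rough sketch of how userspace
might combine the three pieces Paul lists (counter index, offset, and a
seqlock-style lock). The struct layout, the field names, and the
read_hw_counter() stub are assumptions made for illustration and are not
taken from the patch; a real reader would replace the stub with the
architecture's own counter read (mfspr on powerpc, or rdpmc on the intel
chips that allow it). __sync_synchronize() is a GCC/Clang builtin.

```c
#include <stdint.h>

/* Hypothetical layout of the mmapped page; names are assumptions. */
struct counter_page {
	volatile uint32_t lock;	/* sequence count, odd while being updated */
	uint32_t index;		/* hardware counter number + 1, 0 if none */
	int64_t offset;		/* added to the hardware value for the 64-bit count */
};

/* Stand-in for the arch-specific read of hardware counter 'idx'. */
static uint64_t fake_hw[4];

static uint64_t read_hw_counter(uint32_t idx)
{
	return fake_hw[idx];
}

/*
 * Returns 0 and stores the full 64-bit count in *val, or -1 if no
 * hardware counter is currently assigned (the caller would then fall
 * back to a read() on the counter fd).
 */
static int read_counter(const struct counter_page *pg, uint64_t *val)
{
	uint32_t seq, idx;
	uint64_t count;

	do {
		seq = pg->lock;
		__sync_synchronize();	/* read lock before the data */
		idx = pg->index;
		count = (uint64_t)pg->offset;
		if (idx)
			count += read_hw_counter(idx - 1);
		__sync_synchronize();	/* read data before re-checking lock */
		/* retry if an update was in progress or happened meanwhile */
	} while ((seq & 1) || pg->lock != seq);

	if (!idx)
		return -1;
	*val = count;
	return 0;
}
```

The retry loop is what lets the kernel change the index and offset (say,
when the counter is rescheduled onto a different hardware counter)
without userspace ever combining a stale offset with a new counter.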

> > I was planning to do this after cleaning up the normal output bits, as
> > our current output stuff is a mess:
> > - it's spread out over arch code (seems daft to me, we should all output
> > the same)
> > - it's useless for pretty much anything but the two apps we currently
> > have
> >
> > In particular, it lacks the tid information for sampled data I hinted at
> > in the previous email.
>
> Ingo has talked about reusing some of the tracing infrastructure for
> reporting perf_counter events to userspace. That sounds like an
> excellent idea to me, and that is why I didn't bother with putting the
> event queue into the mmapped page at this stage. If it makes sense to
> add it, it can be added later.

Yeah, I've been looking into that, but so far I'm a bit at a loss. All
that tracing stuff is per-cpu, and that's massive overkill for us, since
we're dealing with single cpu streams.

One worry though: supposedly we want to mmap() such buffers too at some
point. How would that interact with what you proposed?
