Re: [RFC][PATCH 6/6] perf, tools: X86 RDPMC, RDTSC test
From: Peter Zijlstra
Date:  Mon Nov 21 2011 - 11:59:36 EST
On Mon, 2011-11-21 at 16:37 +0100, Peter Zijlstra wrote:
> On Mon, 2011-11-21 at 16:29 +0100, Stephane Eranian wrote:
> > Peter,
> > 
> > I don't see how this test and infrastructure handles the case where the event
> > is multiplexed. I know there is time_enabled and time_running. But those are
> > not sync'd to the moment of the rdpmc(). I think there needs to be some other
> > timestamp in the mmap struct so the user can compute a delta to then add to
> > time_enabled and time_running.
> 
> When the counter isn't actually on the PMU, ->index will be 0 and rdpmc
> should not be attempted.
> 
> > Unless, we assume the two time metrics are there ONLY to compute a scaling
> > ratio. In which case, I think, we don't need the delta because if we
> > can do rdpmc() it means the event is currently scheduled and thus
> > time_enabled and time_running are both ticking which means the scaling
> > ratio does not change since the moment the event was scheduled in.
> 
> Right, you don't need delta to compute the scale, but it's useful for
> user-space time based measurements, Arun wanted to do something like
> that.

I'm full of crap, of course that makes a difference :-)

Even when both running and enabled are incremented by the same delta,
the scaling still changes: 3/2 != 4/3, etc.
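
To make the 3/2 vs 4/3 point concrete, here is a minimal sketch. The helper
scale_count() is hypothetical (it is not in the patch), it ignores overflow,
and it returns 0 for an event that never ran, which is an assumption made
only for this example:

```c
#include <stdint.h>

/*
 * Hypothetical helper: estimate the full count of a multiplexed event
 * by scaling the raw count with enabled/running. No overflow handling.
 */
static uint64_t scale_count(uint64_t count, uint64_t enabled, uint64_t running)
{
        if (running == 0)
                return 0;       /* assumption: never ran, nothing to scale */
        return count * enabled / running;
}
```

With the same raw count of 600, enabled/running = 3/2 gives 900, while
adding one tick to both (4/3) gives 800, so the delta does matter for the
scale.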

With that delta we can actually deal with the whole multiplexing thing
without ever having to fall back to read(), something like:

static u64 mmap_read_self(void *addr)
{
        struct perf_event_mmap_page *pc = addr;
        u32 seq, idx, time_mult, time_shift;
        u64 count, cyc, time_offset, enabled, running, delta;
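
        do {
                /* Snapshot the sequence count; retry if it changed. */
                seq = pc->lock;
                barrier();
                enabled = pc->time_enabled;
                running = pc->time_running;
                /* Only a multiplexed event needs the time conversion. */
                if (enabled != running) {
                        cyc = rdtsc();
                        time_mult = pc->time_mult;
                        time_shift = pc->time_shift;
                        time_offset = pc->time_offset;
                }
                /* index is 0 when the event is not on the PMU. */
                idx = pc->index;
                count = pc->offset;
                if (idx)
                        count += rdpmc(idx - 1);
                barrier();
        } while (pc->lock != seq);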
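
        if (enabled != running) {
                u64 quot, rem;

                /* (cyc * mult) >> shift, split to limit overflow. */
                quot = (cyc >> time_shift);
                rem = cyc & (((u64)1 << time_shift) - 1);
                delta = time_offset + quot * time_mult +
                        ((rem * time_mult) >> time_shift);

                /* enabled always ticks; running only while on the PMU. */
                enabled += delta;
                if (idx)
                        running += delta;

                /* count * enabled / running; assumes running != 0. */
                quot = count / running;
                rem = count % running;
                count = quot * enabled + (rem * enabled) / running;
        }
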
        return count;
}
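
The two quot/rem splits in the function can be tried outside the kernel
headers. A minimal standalone sketch follows; the names cyc_to_time() and
scale() are made up for this example, uint64_t/uint32_t stand in for
u64/u32, and returning 0 when running is 0 is an assumption of the sketch:

```c
#include <stdint.h>

/*
 * (cyc * mult) >> shift, computed in two parts so the intermediate
 * product does not overflow 64 bits for large cyc. Assumes shift < 64.
 */
static uint64_t cyc_to_time(uint64_t cyc, uint32_t mult, uint32_t shift)
{
        uint64_t quot = cyc >> shift;
        uint64_t rem = cyc & (((uint64_t)1 << shift) - 1);

        return quot * mult + ((rem * mult) >> shift);
}

/*
 * count * enabled / running, split the same way. Returning 0 for an
 * event that never ran is an assumption of this sketch.
 */
static uint64_t scale(uint64_t count, uint64_t enabled, uint64_t running)
{
        uint64_t quot, rem;

        if (!running)
                return 0;

        quot = count / running;
        rem = count % running;
        return quot * enabled + (rem * enabled) / running;
}
```

Both splits give the same result as the direct expression whenever the
direct expression does not overflow, since the remainder term carries
exactly the part the quotient drops.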

Now all I need to do is make sure pc->offset actually makes sense,
because currently it looks like we're off by event->hw.prev_count
when idx is set.
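
One way to picture the prev_count problem is the toy model below. It is not
kernel code: published_offset() and user_count() are hypothetical names,
prev_raw stands in for event->hw.prev_count, and raw values are treated as
full 64-bit numbers, ignoring the real counter width:

```c
#include <stdint.h>

/*
 * Toy model. total is the count accumulated so far, prev_raw is the raw
 * hardware value last folded into it. If userspace is going to add a
 * raw read, pre-subtract prev_raw so only the new part is counted.
 */
static uint64_t published_offset(uint64_t total, uint64_t prev_raw, int on_pmu)
{
        return on_pmu ? total - prev_raw : total;
}

/* What the reader computes: offset plus the raw read when on the PMU. */
static uint64_t user_count(uint64_t offset, uint64_t raw_now, int on_pmu)
{
        return on_pmu ? offset + raw_now : offset;
}
```

In this model, with total = 1000, prev_raw = 250 and a raw read of 300, the
reader gets 1050, that is the 1000 already accounted for plus the 50 events
since the last update. Without the subtraction it would get 1300.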