Re: [RFC PATCH 2/9] perf: Add ability to dump user regs

From: Frederic Weisbecker
Date: Wed Oct 20 2010 - 12:14:04 EST


On Wed, Oct 20, 2010 at 11:24:42AM +0200, Stephane Eranian wrote:
> On Tue, Oct 19, 2010 at 12:35 AM, Frederic Weisbecker
> <fweisbec@xxxxxxxxx> wrote:
> > On Mon, Oct 18, 2010 at 12:01:18PM +0200, Stephane Eranian wrote:
> >> On Sun, Oct 17, 2010 at 12:07 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >> > On Sat, 2010-10-16 at 00:58 +0200, Frederic Weisbecker wrote:
> >> >> > Yes, PEBS does not capture the entire state.
> >> >> >
> >> >> > Here is what you get on Intel Core:
> >> >> >         u64 flags, ip;
> >> >> >         u64 ax, bx, cx, dx;
> >> >> >         u64 si, di, bp, sp;
> >> >> >         u64 r8,  r9,  r10, r11;
> >> >> >         u64 r12, r13, r14, r15;
> >> >
> >> >> Ok, that seems to cover most of the state. I guess few people care
> >> >> about cs, ds, es, fs, gs, most of the time.
> >> >
> >> > Yeah, except if you want to profile wine or something like that ;-)
> >> >
> >> That means that if you want the segment registers, then you cannot
> >> use PEBS. I think you could catch that when the event is created.
> >>
> >> The other problem here is how to name registers at the API level.
> >> You would be introducing architecture-specific register names
> >> in perf_event.h. There is no such thing today.
> >
> >
> > That can go into an asm/perf_regs.h or something. It's up to the
> > arch to name its registers.
> >
> I am fine with that.
>
> Starting with Nehalem, there is a PEBS mode where HW captures
> not just actual register state but also information about cache misses
> such as the data address, miss latency, data source. Those are
> stored in the PEBS record as u64. I believe we could also expose
> this thru this register bitmask mechanism. Of course, you'd get a
> failure if PEBS is not programmed correctly.



I'm not sure the registers are the right place for that. This is
too oriented toward a specific mechanism.

I would rather put that into a PERF_SAMPLE_RAW dump or a specific PEBS
sample.
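
For context, here is a rough sketch of the register-mask idea discussed
above, including the check at event creation that Stephane mentions. All
names are hypothetical (nothing like this exists in perf_event.h today),
and the set of PEBS-captured registers is taken only from the Intel Core
list quoted earlier. Note that fields like data address or miss latency
have no natural slot in such an enum, since they are not registers.

```c
#include <stdint.h>

/* Hypothetical per-arch register indices, as an asm/perf_regs.h might
 * define them. The order follows the PEBS record quoted above. */
enum perf_reg_x86 {
	REG_FLAGS, REG_IP,
	REG_AX, REG_BX, REG_CX, REG_DX,
	REG_SI, REG_DI, REG_BP, REG_SP,
	REG_R8, REG_R9, REG_R10, REG_R11,
	REG_R12, REG_R13, REG_R14, REG_R15,
	/* Segment registers: not part of the quoted PEBS record. */
	REG_CS, REG_DS, REG_ES, REG_FS, REG_GS,
	REG_MAX
};

#define REG_BIT(r)	(1ULL << (r))
/* Everything below REG_CS is what PEBS captures in this sketch. */
#define PEBS_REGS_MASK	(REG_BIT(REG_CS) - 1)
#define ALL_REGS_MASK	(REG_BIT(REG_MAX) - 1)

/*
 * Validate a requested register mask at event creation time.
 * 'precise' is nonzero when the event would be sampled through PEBS.
 * Returns 0 if the request can be honored, -1 otherwise.
 */
static int check_regs_mask(uint64_t mask, int precise)
{
	if (mask & ~ALL_REGS_MASK)
		return -1;	/* unknown register bits */
	if (precise && (mask & ~PEBS_REGS_MASK))
		return -1;	/* e.g. segment registers with PEBS */
	return 0;
}
```

With this, a request for REG_FS on a PEBS event fails up front, while the
same mask on a non-PEBS event is accepted.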

The problem with PERF_SAMPLE_RAW is that the perf tools always assume
it is trace event content. They should check which event they are
looking at before making that assumption.

We'd need to look at the event that triggered the sample to
interpret the raw sample data. That's fixable.
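
Something along these lines on the tools side, as a minimal sketch only.
The event kinds, the format enum and struct pebs_mem_info are made up
for illustration; the three u64 fields simply mirror the data address,
miss latency and data source mentioned above, not an actual PEBS record
layout. The only thing taken from the existing ABI is that a
PERF_SAMPLE_RAW payload is a u32 size followed by that many bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical classification of the event that produced the sample. */
enum ev_kind { EV_TRACEPOINT, EV_PEBS_MEM, EV_OTHER };

/* How the raw payload should be interpreted. */
enum raw_format { RAW_FMT_TRACE_EVENT, RAW_FMT_PEBS_MEM, RAW_FMT_OPAQUE };

/* PERF_SAMPLE_RAW payload: a size and 'size' bytes of data. */
struct raw_sample {
	uint32_t size;
	const unsigned char *data;
};

/* Hypothetical layout for the extra PEBS fields, each stored as u64. */
struct pebs_mem_info {
	uint64_t data_addr;
	uint64_t latency;
	uint64_t data_src;
};

/* Pick the decoder from the event, instead of assuming a trace event. */
static enum raw_format raw_format_for(enum ev_kind kind)
{
	switch (kind) {
	case EV_TRACEPOINT:
		return RAW_FMT_TRACE_EVENT;
	case EV_PEBS_MEM:
		return RAW_FMT_PEBS_MEM;
	default:
		return RAW_FMT_OPAQUE;
	}
}

/* Copy the PEBS fields out of the raw payload; -1 if it is too short. */
static int decode_pebs_mem(const struct raw_sample *raw,
			   struct pebs_mem_info *out)
{
	if (raw->size < sizeof(*out))
		return -1;
	memcpy(out, raw->data, sizeof(*out));
	return 0;
}
```

The point is only that the dispatch happens on the event first, so an
unknown event's raw data is left opaque rather than parsed as a trace
event.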



> The alternative would be to invent yet another generic abstraction
> to sample cache misses. Note that PEBS cache miss sampling
> cannot be attached to an existing generic cache miss event. It
> uses a dedicated event which does not count all cache misses.

Then perhaps that should be abstracted into a different event, yeah.
