Re: [PATCH] perf record: Enable PERF_SAMPLE_ID when sampling multiple events

From: Ingo Molnar
Date: Fri Oct 23 2009 - 02:18:51 EST



* Anton Blanchard <anton@xxxxxxxxx> wrote:

>
> Hi Ingo,
>
> > > If we are sampling multiple events we need the id in each sample so we
> > > can differentiate between them in a perf data file.
> >
> > Wondering, what are you (or will you be) using this for?
>
> I put together a simple python library for parsing perf.data files:
>
> http://ozlabs.org/~anton/junkcode/perf_event.py
>
> An example of using it is here:
>
> http://ozlabs.org/~anton/junkcode/perf_event_example.py
>
> Only tested on powerpc so far, but it should work on x86. It's still
> missing bits but it has been useful for finding some corner cases in
> perf_event. It should also make it easy to post process complex
> profiles with multiple events in them.

Ah, cool!

Note, there's a related development: we are working on script extensions
to perf, in a built-in way. It can be found in this patch series from
Tom on lkml:

[RFC][PATCH 0/9] perf trace: support for general-purpose scripting

Tom started with Perl support - Python could be another script engine to
add.

Now, your perf_event.py library goes deeper and exposes the
perf.data itself as an independent codepath. I _think_ Tom's approach
gives us a bit of extra value by allowing us to tweak the environment
of scripts with each perf version - i.e. we can iterate the perf.data
format in the future without breaking scripts.

We are not ready yet to declare perf.data an ABI, and there are a few
changes in tip:perf/* that might break the Python library.

Also, as your fix demonstrates, there's extra value in going
ab-initio as well. Just wanted to mention the scripting engine work to
couple perf with scriptlets, in case you find it interesting. We could
easily do both.

> One problem this has just found though, is with PERF_EVENT_SAMPLE:
>
> # FIXME: If sampling multiple events we have an issue
> # here. Since the SAMPLE_ID is not the first optional field
> # it might be impossible to differentiate between
> # events since the SAMPLE_ID field would be at different
> # offsets. For now we assume all events use the same
> # set of optional fields.
> eventnr = 0
> self.event = sample_event(eventbuf,
> self.header.attrs[eventnr].sample_type)
>
> It seems like the API allows us to specify different sample options
> for different events, but since the ID isn't the first option it could
> end up in different places in different events, making it difficult
> (if not impossible in some cases) to tag events correctly.

Could we fix this bug at the kernel level somehow, to imply SAMPLE_ID
automatically? Producing a stream of data that cannot be decoded in some
cases does not look smart.
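
To illustrate the decoding problem, here is a rough sketch (not taken
from perf_event.py). It assumes the sample_type bit values and the
field order of the PERF_RECORD_SAMPLE layout as documented in
include/linux/perf_event.h at the time; id_offset() and the two example
events are hypothetical names used only for illustration.

```python
# Sketch only: bit values and field order are assumed from the
# PERF_RECORD_SAMPLE layout in include/linux/perf_event.h of this era.
PERF_SAMPLE_IP   = 1 << 0
PERF_SAMPLE_TID  = 1 << 1
PERF_SAMPLE_TIME = 1 << 2
PERF_SAMPLE_ADDR = 1 << 3
PERF_SAMPLE_ID   = 1 << 6

# Optional fields emitted before the id. Each occupies 8 bytes
# (the TID field is a pid/tid pair of two u32 values).
FIELDS_BEFORE_ID = (PERF_SAMPLE_IP, PERF_SAMPLE_TID,
                    PERF_SAMPLE_TIME, PERF_SAMPLE_ADDR)


def id_offset(sample_type):
    """Return the byte offset of the id within the sample body (after
    the record header), or None if PERF_SAMPLE_ID was not requested."""
    if not sample_type & PERF_SAMPLE_ID:
        return None
    return 8 * sum(1 for bit in FIELDS_BEFORE_ID if sample_type & bit)


# Two hypothetical events in one session with different sample options.
event_a = PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ID
event_b = event_a | PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR

print(id_offset(event_a), id_offset(event_b))
```

Under these assumptions the id sits at a different offset for each of
the two events, so a reader has to know which event produced a sample
before it can find the field that identifies the event.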

Ingo