Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist

From: Adrian Hunter
Date: Thu Oct 08 2015 - 12:09:36 EST


On 7/10/2015 12:06 p.m., Namhyung Kim wrote:
Hi Adrian,

On Tue, Oct 6, 2015 at 6:26 PM, Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
On 06/10/15 12:03, Namhyung Kim wrote:
Hi Adrian,

On Mon, Oct 5, 2015 at 8:29 PM, Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
On 02/10/15 21:45, Arnaldo Carvalho de Melo wrote:
Em Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim escreveu:
Since it's gonna share struct mmap with dummy tracking evsel to track
meta events only, let's move auxtrace out of struct perf_mmap.
Is this moving around _strictly_ needed?

Also, what if you wanted to capture AUX data and tracking together?

Hmm.. I don't know what's the problem. It should be orthogonal and
support doing that together IMHO. Maybe I'm missing something about
the aux data processing and Intel PT. I'll take a look at it..


It is only orthogonal if you assume we will never want to support parallel
processing with Intel PT.

We'll definitely want it. :)


The only change that needs to be made is not to assume there is only 1
tracking event.

Sorry for the slow reply.


IIUC Intel PT (and BTS?) needs at most 2 dummy events - one to
track task/mmap and another to track context switches. The latter
is basically a light-weight version of the sched_switch event, right?

Yes


For parallel processing, each cpu needs to keep track of its current
thread to synthesize events from auxtrace data. So if it processed the
switch events before processing samples, it'd need to build long lists
of current threads per cpu. IMHO it'd be better to process the switch
events together with samples using multiple threads rather than
processing them prior to samples.
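
To illustrate the point above, here is a minimal sketch (not perf code) of
handling switch events in the same time-ordered pass as the samples, so that
only one "current thread" value per cpu is kept instead of a per-cpu history
list. All names here (struct ev, attribute_samples, MAX_CPUS) are hypothetical
and the event layout is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 4	/* assumption: small fixed cpu count for the sketch */

enum ev_type { EV_SWITCH, EV_SAMPLE };

/* Hypothetical, simplified event record; not a perf structure. */
struct ev {
	enum ev_type type;
	int cpu;
	uint64_t time;
	int tid;	/* EV_SWITCH: incoming tid, EV_SAMPLE: filled in below */
};

/*
 * Walk events that are already sorted by time. A switch event updates the
 * single current tid of its cpu; a sample takes the tid current on its cpu.
 * Samples seen before any switch on that cpu get tid -1 (unknown).
 */
static void attribute_samples(struct ev *evs, size_t n)
{
	int cur[MAX_CPUS];
	size_t i;

	for (i = 0; i < MAX_CPUS; i++)
		cur[i] = -1;

	for (i = 0; i < n; i++) {
		struct ev *e = &evs[i];

		if (e->cpu < 0 || e->cpu >= MAX_CPUS)
			continue;	/* ignore cpus outside the sketch's range */

		if (e->type == EV_SWITCH)
			cur[e->cpu] = e->tid;
		else
			e->tid = cur[e->cpu];
	}
}
```

Because each cpu's state is independent in this sketch, one such pass could
run per cpu in a separate thread, which is the limitation (division by cpu
only) raised in the reply below.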

That is a good point.

But that would be limited to dividing the data by cpu. It would be more
useful to divide it any which way. Does 'perf report' care if the
data is not in order?

So how about this? It'd *always* use 2 dummy (or 1 dummy + 1
sched_switch) events. The tracking dummy events would be recorded on
the tracking mmaps and the switch (dummy) event would be recorded on
the main mmaps. This way we can parallelize the auxtrace processing
without the lists of current threads IMHO.

Am I missing something?

Thinking about it now, it would probably make sense to put the AUX
event with the tracking events as well, so the data can be queued up
ready for processing, then the AUX index would not be needed. But of
course, if there were no other events, then there would be no main
mmap at all.

From that point of view, I guess I don't need to worry about splitting
up the mmaps at all, just process them more than once if need be.



IMHO there could be separate mmap_params also, which would allow for
different mmap sizes for the tracking and main mmaps.

Currently, the tracking mmap size is fixed at an arbitrary size
(128KiB) regardless of the main mmaps. I can add an option to change
the tracking mmap size too.

I meant more from the program point of view, to allow different parameters.
Such as allowing one mmap to be PROT_READ and the other PROT_READ|PROT_WRITE
i.e. collect all the tracking events but let the other events overwrite
- perhaps as some kind of snapshot mode like we do with Intel PT.
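
A minimal sketch of what separate per-mmap parameters could look like. It
relies on the behaviour described in perf_event_open(2): the kernel only
honours data_tail (and so avoids overwriting unread data) when the ring
buffer is mapped with PROT_WRITE. The names struct mmap_params_sketch,
mmap_prot() and mmap_len() are hypothetical, not the existing perf tools API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical per-mmap parameters; not the perf tools' struct mmap_params. */
struct mmap_params_sketch {
	bool overwrite;		/* true: let the kernel overwrite old data */
	size_t data_pages;	/* ring buffer data pages, a power of 2 */
};

/* Read-only mapping means overwrite mode; writable means no data is lost. */
static int mmap_prot(const struct mmap_params_sketch *mp)
{
	return mp->overwrite ? PROT_READ : PROT_READ | PROT_WRITE;
}

/* Mapping length: one control (header) page plus the data pages. */
static size_t mmap_len(const struct mmap_params_sketch *mp, size_t page_size)
{
	return (mp->data_pages + 1) * page_size;
}
```

With this, the tracking mmap would use overwrite = false so that every
tracking event is collected, while the main mmap could use overwrite = true
for a snapshot-like mode.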

It seemed to me that it would be more flexible to put evsels into mmap
groups. Then those groups could have any events or be used in various ways.
I also thought it might make the mmap code more readable, instead of having
lots of "if tracking event do something different".
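
As a rough sketch of the mmap-group idea, assuming a group simply owns a set
of evsels plus its own mapping policy. Everything here (struct evsel_sketch,
struct mmap_group_sketch, mmap_group_add) is hypothetical and only meant to
show how the "is this a tracking event" checks could turn into "which group
is this evsel in".

```c
#include <stdbool.h>
#include <stddef.h>

#define GROUP_MAX_EVSELS 8	/* assumption: fixed capacity for the sketch */

/* Hypothetical stand-in for an evsel; not struct perf_evsel. */
struct evsel_sketch {
	const char *name;
};

/* A group of evsels that share one set of mmaps and one mapping policy. */
struct mmap_group_sketch {
	const char *name;
	bool overwrite;		/* policy applies to every evsel in the group */
	size_t nr;
	const struct evsel_sketch *evsels[GROUP_MAX_EVSELS];
};

/* Add an evsel to a group; returns 0 on success, -1 if the group is full. */
static int mmap_group_add(struct mmap_group_sketch *g,
			  const struct evsel_sketch *evsel)
{
	if (g->nr >= GROUP_MAX_EVSELS)
		return -1;
	g->evsels[g->nr++] = evsel;
	return 0;
}
```

The mmap code would then iterate over groups and apply each group's policy,
without needing to know what kind of events the group holds.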

On the other hand, it is just a thought. As I mentioned above, I realized
I could probably manage without splitting the mmaps.