Re: [BUG] perf report: ordered events and flushing bug

From: Arnaldo Carvalho de Melo
Date: Thu Mar 12 2015 - 16:16:28 EST


Em Thu, Mar 12, 2015 at 01:53:29PM -0600, David Ahern escreveu:
> On 3/12/15 1:39 PM, Stephane Eranian wrote:
> >What's the point of having all the ordered event logic if you are saying
> >events must be saved in order? I don't think there is a way to make that
> >guarantee when monitoring multiple CPUs at the same time.
>
> The record command does not analyze the events, it just copies from
> mmap to file in lumps per mmap. e.g., on a given round the perf data
> file has events like this:
>
> 111112223344444444555566666F111111111
> |<------- round --------->|^
>                            |
>      finished round event -|
>
> where 11111 are events read from mmap1, 2222 are events from mmap2,
> etc. F is the finished round event, which marks that a pass over all
> mmaps has been done.
>
> So for mmap1 all of the 11111 events are in time order, then jumping
> to mmap2 events the 2222 times are time sorted relative to mmap2 but
> not relative to mmap1 events.
>
> The ordered events code sorts the clumps into a time based stream:
> 123141641445124564234645656...

And it does that because it merges all the mmap buffers into just one
file...
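
To illustrate the reordering David describes, here is a minimal Python
sketch. It is not the perf code; it is loosely modeled on the idea that, at
each finished round marker, everything up to the highest timestamp seen by
the previous marker can be flushed safely. The record layout
((timestamp, mmap_id) tuples, "F" for the marker) and all names are
assumptions made for the example.

```python
import heapq

FINISHED_ROUND = "F"  # hypothetical stand-in for the finished round event


def ordered_stream(records):
    """Yield (timestamp, mmap_id) events in time order.

    `records` is the file order: lumps of per-mmap events, each lump
    time-sorted on its own, with FINISHED_ROUND after each pass over
    all mmaps.
    """
    pending = []        # min-heap of buffered events
    max_seen = None     # highest timestamp buffered so far
    flush_limit = None  # max_seen as of the previous round marker
    for rec in records:
        if rec == FINISHED_ROUND:
            # Events at or below the previous round's maximum can no
            # longer be preceded by anything still sitting in an mmap.
            if flush_limit is not None:
                while pending and pending[0][0] <= flush_limit:
                    yield heapq.heappop(pending)
            flush_limit = max_seen
        else:
            heapq.heappush(pending, rec)
            if max_seen is None or rec[0] > max_seen:
                max_seen = rec[0]
    # End of file: drain whatever is left, still in time order.
    while pending:
        yield heapq.heappop(pending)
```

Fed two rounds of two mmaps each, the generator hands back the events
sorted by timestamp, while holding in memory only what it cannot flush yet.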

OK, for inserting MMAP events (or any other), I think one could either
use perf inject and merge two perf.data files, both in order, or add a
'perf data merge' subcommand to 'perf data'; perhaps the latter will be
useful in more cases.

But there is something else here, we should take advantage of the fact
that events in each perf mmap are ordered and keep that in the output of
perf record, i.e. we should start one thread per CPU that will just
write into a .perf.data/cpu-N file.

Then, when reading, we will do what I'll do for 'trace' and 'top', i.e.
order the N cpus and go on processing in order, if you need that
(tracing, perf top perhaps).
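
A minimal Python sketch of that read side, assuming each per-CPU file
already yields its events in time order. The (timestamp, payload) layout
and the function names are made up for the example; this only shows the
N-way merge, not how perf would actually read such files.

```python
import heapq


def _tag(cpu, events):
    """Attach the CPU number to each (timestamp, payload) event."""
    for ts, payload in events:
        yield ts, cpu, payload


def merge_cpu_streams(streams):
    """Merge per-CPU event streams into one time-ordered stream.

    `streams` maps a CPU number to an iterable of (timestamp, payload)
    that is already sorted by timestamp, as one file per CPU would be.
    """
    tagged = [_tag(cpu, events) for cpu, events in sorted(streams.items())]
    # heapq.merge keeps only one pending event per stream in memory.
    # Compare on (timestamp, cpu) so payloads never need to be orderable.
    return heapq.merge(*tagged, key=lambda ev: ev[:2])
```

Since every input is already sorted, no buffering of whole rounds is
needed: the merge looks only at the head of each stream.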

Or do a first pass, get the lifetime events, aka the PERF_RECORD_
metadata, stash in the struct machine rbtrees, as usual, but keeping a
reference to all threads, even the dead ones, which I guess is what
Namhyung does in some way in his patchkit, then go wild processing the
samples in parallel.
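
A rough Python sketch of the two-pass idea, under heavy assumptions: the
records are hypothetical (kind, timestamp, tid, data) tuples, a dict stands
in for the struct machine rbtrees, and a thread pool stands in for whatever
parallelism perf would use. It only shows why keeping dead threads around
lets samples be resolved in any order.

```python
import bisect
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def build_thread_table(records):
    """Pass 1: collect lifetime events into tid -> [(timestamp, comm)].

    Exit events are deliberately ignored, so dead threads stay known.
    """
    table = {}
    for kind, ts, tid, data in records:
        if kind == "comm":
            table.setdefault(tid, []).append((ts, data))
    for names in table.values():
        names.sort()
    return table


def comm_at(table, tid, ts):
    """Return the comm that was valid for `tid` at time `ts`."""
    names = table.get(tid, [])
    i = bisect.bisect_right([t for t, _ in names], ts) - 1
    return names[i][1] if i >= 0 else "<unknown>"


def count_samples(table, samples):
    """Count samples per comm; only reads the table, so it can run in parallel."""
    return Counter(comm_at(table, tid, ts) for _, ts, tid, _ in samples)


def process(records, workers=2):
    """Pass 2: split the samples into chunks and process them in parallel."""
    table = build_thread_table(records)
    samples = [r for r in records if r[0] == "sample"]
    chunks = [samples[i::workers] for i in range(workers)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(lambda c: count_samples(table, c), chunks):
            total += part
    return total
```

The table is read-only during the second pass, so the workers need no
locking and the order in which chunks are processed does not matter.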

So, I think for Stephane, right now, the easiest path to follow is to
hack 'perf inject' to insert the MMAP events where he needs, right?

Agreed?

- Arnaldo