Re: [PATCH] perf tools: Do not fail on processing out of order event

From: Jiri Olsa
Date: Thu Nov 27 2014 - 09:07:38 EST


On Thu, Nov 27, 2014 at 11:53:00AM -0200, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 27, 2014 at 11:56:03AM +0100, Ingo Molnar escreveu:
> >
> > * Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> >
> > > Em Wed, Nov 26, 2014 at 04:39:31PM +0100, Jiri Olsa escreveu:
> > > > Linus reported perf report command being interrupted due to
> > > > processing of 'out of order' event, with following error:
> > > >
> > > > Timestamp below last timeslice flush
> > > > 0x5733a8 [0x28]: failed to process type: 3
> > > >
> > > > I could reproduce the issue and in my case it was caused by one
> > > > CPU (mmap) being behind during record and userspace mmap reader
> > > > seeing the data after other CPUs data were already stored.
> > > >
> > > > It'd be nice to find the source of this hiccup, meanwhile I'm
> > > > submitting change that does not kill the processing, but only
> > > > tracks the number of out of order events and warn user.
> > >
> > > Works, for me, thanks for the fast response!
> >
> > Thanks Jiri!
> >
> > > > --- a/tools/perf/util/session.c
> > > > +++ b/tools/perf/util/session.c
> > > > @@ -533,15 +533,11 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
> > > > return -ETIME;
> > > >
> > > > if (timestamp < oe->last_flush) {
> > > > - WARN_ONCE(1, "Timestamp below last timeslice flush\n");
> > > > -
> > > > - pr_oe_time(timestamp, "out of order event");
> > > > + pr_oe_time(timestamp, "out of order event\n");
> > > > pr_oe_time(oe->last_flush, "last flush, last_flush_type %d\n",
> > > > oe->last_flush_type);
> > > >
> > > > - /* We could get out of order messages after forced flush. */
> > > > - if (oe->last_flush_type != OE_FLUSH__HALF)
> > > > - return -EINVAL;
> > > > + s->stats.nr_unordered_events++;
> >
> > Btw., in the forced flush case we'll get out of order events that
> > are 'expected'. Shouldn't we count them separately and not warn
> > about them, or so?
> >
> > > > + if (session->stats.nr_unordered_events != 0) {
> > > > + ui__warning("%u out of order events recorded.\n",
> > > > + session->stats.nr_unordered_events);
> > > > + }
> >
> > Nit: I'd suggest keeping the message printout on a single line:
> >
> > if (session->stats.nr_unordered_events != 0) {
> > ui__warning("%u out of order events recorded.\n", session->stats.nr_unordered_events);
> >
> > as IMHO the cure for this col80 linebreak checkpatch warning is
> > worse than the disease! :-)
> >
> > Barring those details:
> >
> > Acked-by: Ingo Molnar <mingo@xxxxxxxxxx>
> >
> > Arnaldo, I suppose the fix will go upstream via your tree, as a
> > pull request for v3.18 fixes?
>
> Right, I can do that.
>
> And I agree with the "expected" for events that are out of the current
> reordering window, i.e. we can't insert something into previous windows,
> so those are, humm, what would be a good name:
> stats->out_of_reordering_window, while stats->reordered, would be for
> events that were found out of order, but were successfully sorted as
> part of a flush operation, right?

the forced flush is when we find out we crossed the allowed allocation
space for the samples queue.. so we take the half of the sorted queue
and flush it.. for this case we break the flushing logic and we might
(probably just in theory) get out of order events

but IMO both cases of out of order event are equal.. we dont do
anything special for forced flushed AFAIK

jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/