Re: Tracing Requirements (was: [RFC/Requirements/Design] h/w error reporting)

From: Mathieu Desnoyers
Date: Wed Nov 10 2010 - 18:28:38 EST


* Thomas Gleixner (tglx@xxxxxxxxxxxxx) wrote:
> On Wed, 10 Nov 2010, Mathieu Desnoyers wrote:
>
> > * Luck, Tony (tony.luck@xxxxxxxxx) wrote:
> > > >- Perf does not support flight recorder tracing (concurrent read/write)
> > > > - Sub-buffers are needed to support concurrent read/writes
> > >
> > > When I hear somebody say "flight recorder" - I think of "black boxes"
> > > in airplanes that log data while the flight is running, and are only
> > > looked at offline later. So I'm confused by the "concurrent read/write"
> > > requirement.
> > >
> > > Perhaps you could explain the use cases of your "flight recorder",
> > > because it seems that the name doesn't fit exactly, and this is
> > > causing me (and maybe others) some confusion.
> >
> > As Steven pointed out, the flight recorder buffers are set to overwrite the
> > oldest data when the buffer is filled. Therefore, the tracer can be used in
> > closed-circuit mode (without extracting the data out of the memory buffers) to
> > keep a trace of the recent events. The trace can be extracted when an
> > interesting condition (trigger) occurs.
> >
> > A typical use-case is to let it run on an end-user machine to enhance
> > application crash diagnosis with tracing information, albeit using a very small
> > fraction of the system resources to do so.
> >
> > The reason why "concurrent read/write" is required is for server-class machines
> > which need to continuously be able to gather trace data to report/find/locate
> > problematic scenarios happening. This means we're not only interested in one
> > single failure, but rather in a whole set of erroneous/warning conditions that
> > need to be reported. Stopping tracing every time data is gathered is
> > inappropriate, because it would hide errors/warnings that would be happening
> > during data collection.
>
> Aargh! Just because it can be done all in one with an insane amount of
> complexity does not mean that it's an absolute requirement and a good
> solution.
>
> So if you want to have both the flight recorder crash documentation
> and the ongoing monitoring then use two separate sessions with
> separate modes and be done with it.
>
> Cramming both into the same session is just insane.
>

I'm afraid this is not what I proposed above. I'm open to using different tracing
sessions for different things. However, the server-class case needs to
continuously gather data so that "trace-shots" can be gathered when problems
occur. But if you hit two problems back to back, you don't want to lose the
trace leading to the second issue. Hence the motivation for supporting
concurrent reading while writing.
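
To make the back-to-back case concrete, here is a minimal single-threaded
sketch of an overwrite-mode buffer split into sub-buffers. It is only a toy
model, not the code of any existing tracer: the names FlightRecorder, write()
and snapshot() are hypothetical, and it assumes at least two sub-buffers and
no real concurrency (a real implementation needs lockless or locked
sub-buffer hand-off between writer and reader).

```python
from collections import deque


class FlightRecorder:
    """Toy overwrite-mode ring of fixed-size sub-buffers (hypothetical model)."""

    def __init__(self, n_subbufs, subbuf_size):
        # Assumption: n_subbufs >= 2 (one for the writer, the rest for history).
        self.subbuf_size = subbuf_size
        # Completed sub-buffers, oldest first. With maxlen set, appending to a
        # full deque drops the oldest sub-buffer: this is the overwrite mode.
        self.full = deque(maxlen=n_subbufs - 1)
        self.current = []  # sub-buffer currently owned by the writer

    def write(self, event):
        """Writer side: always succeeds, never waits for a reader."""
        self.current.append(event)
        if len(self.current) == self.subbuf_size:
            self.full.append(self.current)
            self.current = []

    def snapshot(self):
        """Reader side: detach the completed sub-buffers, leave the writer alone."""
        taken = list(self.full)
        self.full.clear()
        return [event for subbuf in taken for event in subbuf]


rec = FlightRecorder(n_subbufs=3, subbuf_size=2)

for i in range(10):          # events leading up to the first problem
    rec.write(i)
first = rec.snapshot()       # tracing is not stopped here

for i in range(10, 13):      # events leading up to the second problem
    rec.write(i)
second = rec.snapshot()
```

In this model the first snapshot holds only the most recent completed
sub-buffers, and the second still holds the events written after the first
one was taken, because the writer was never stopped in between.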

> The first rule is "Keep It Simple!". Period.

I'd like to start with an implementation that skips some of these requirements
initially, but what I really think we need to figure out is how we organize our
ABIs to finally support these requirements.

Thanks,

Mathieu

>
> Thanks,
>
> tglx

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com