Re: [PATCH 2/2] tracing/function-graph-tracer: use the morelightweight local clock

From: Ingo Molnar
Date: Thu Mar 05 2009 - 06:57:19 EST



* Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:

> On Thu, Mar 05, 2009 at 11:56:52AM +0100, Ingo Molnar wrote:
> >
> > * Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> >
> > > On Thu, Mar 05, 2009 at 08:30:13AM +0100, Peter Zijlstra wrote:
> > > > On Thu, 2009-03-05 at 02:19 +0100, Frederic Weisbecker wrote:
> > > >
> > > > > It takes 1 ms to execute while tracing.
> > > > > Considering my frequency is 250 Hz, it means 1/4 of the system is used
> > > > > on timer interrupt while tracing.
> > > > >
> > > > > For now the hang is fixed, but not the awful latency. And I'm just too frightened
> > > > > to test it on 1000 Hz.
> > > > >
> > > > > But I plan to add a kind of watchdog to check how many time we spent inside an
> > > > > interrupt while graph tracing.
> > > > > By checking this time against the current Hz value, I could decide to abort the tracing
> > > > > for all irq.
> > > >
> > > > That would basically render the thing useless :-(
> > >
> > >
> > > It would be only for slow machines :-)
> > > I'm talking about something that happened on a Pentium II.
> > >
> > >
> > > > Is it specifically function_graph that is so expensive? If so, is that
> > > > because of the function exit hook?
> > >
> > >
> > > Yes, specifically the function_graph, the function tracer is
> > > not concerned. The function graph tracer takes more than
> > > double overhead compared to the function tracer.
> > >
> > > Usually the function tracer hooks directly the the function
> > > that insert the event, it's pretty straightforward.
> > >
> > > The function graph does much more work:
> > >
> > > entry: basic checks, take the time, push the infos on the stack, insert an event
> > > on the ring-buffer, hook the return value.
> > > return: pop the infos from stack, insert an event on the ring-buffer, jump
> > > to the original caller.
> > >
> > > It has a high cost... which makes me sad because I plan to
> > > port it in on Arm and I fear the little Arm boad I recently
> > > purshased will not let me trace the interrupts without
> > > hanging...
> > > :-(
> > >
> > > I guess I should start thinking on some optimizations, perhaps
> > > using perfcounter?
> >
> > yeah. perfcounters and KernelTop might not work on a PII CPU out
> > of box though.
> >
> > But hacking perfcounters and looking at perfstat/kerneltop
> > output is serious amount of fun so if you are interested you
> > could try to implement support for it. Do you have any box where
> > perfcounters work? (that would be Core2 Intel boxes or pretty
> > much any AMD box)
> >
> > You could have a look at how oprofile works on your box - the
> > code for PII CPUs should be in
> > arch/x86/oprofile/op_model_ppro.c.
> >
> > There's also hardcoded support for a single perfcounter in the
> > nmi_watchdog=2 code, in arch/x86/kernel/cpu/perfctr-watchdog.c,
> > for pretty much any x86 CPU that has a PMU.
> >
> > Plus there's also the CPU documentation on Intel's site. It's
> > quite well written and pretty well structured. The URL for the
> > CPU's PMU ("Performance Monitoring") should be:
> >
> > http://download.intel.com/design/processor/manuals/253669.pdf
> >
> > As a last resort ;-)
> >
> > Ingo
>
> Ah yes, That could be fun!
> So, by reading your description, it should work on my labtop I guess?
>
> -> Intel(R) Pentium(R) Dual CPU T2310 @ 1.46GHz

Yeah, should work fine there - so that should be a good
reference point to start off. Let me know if you see any
bugs/problems.

> Anyway, I will give it a try and see what I can do.
> Thanks for the pointers.

You are welcome.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/