Re: perf_counter: request for three more sample data options

From: Peter Zijlstra
Date: Fri Apr 03 2009 - 12:58:23 EST


On Fri, 2009-04-03 at 18:41 +0200, Ingo Molnar wrote:
> * Robert Richter <robert.richter@xxxxxxx> wrote:
>
> > On 03.04.09 19:51:11, Paul Mackerras wrote:
> > > Peter Zijlstra writes:
> > >
> > > > What I was thinking of was re-using some of the cpu_clock()
> > > > infrastructure. That provides us with a jiffy based GTOD sample,
> > > > cpu_clock() then uses TSC and a few filters to compute a current
> > > > timestamp.
> > > >
> > > > I was thinking about cutting back those filters and thus trusting the
> > > > TSC more -- which on x86 can do any random odd thing. So provided the
> > > > TSC is not doing anything funny, the results will be ok-ish.
> > > >
> > > > This does mean, however, that it's not possible to know when it's gone bad.
> > >
> > > I would expect that perfmon would be just reading the TSC and
> > > recording that. If you can read the TSC and do some correction then
> > > we're ahead. :)
> > >
> > > > The question to Paul is, does the powerpc sched_clock() call work in NMI
> > > > -- or hard irq disable -- context?
> > >
> > > Yes - timekeeping is one area where us powerpc guys can be smug.
> > > :) We have a per-core, 64-bit timebase register which counts at
> > > a constant frequency and is synchronized across all cores. So
> > > sched_clock works in any context on powerpc - all it does is
> > > read the timebase and do some simple integer arithmetic on it.
> >
> > Ftrace is using ring_buffer_time_stamp() that finally uses
> > sched_clock(). But I am not sure if the time is correct when
> > calling from an NMI handler.
>
> Yeah, that's a bit icky. Right now we have the following
> accelerator:
>
> u64 sched_clock_cpu(int cpu)
> {
> 	u64 now, clock, this_clock, remote_clock;
> 	struct sched_clock_data *scd;
>
> 	if (sched_clock_stable)
> 		return sched_clock();
>
> which works rather well on CPUs that set sched_clock_stable. Do you
> think we could set it on Barcelona?

I think you should couple it to the tsc clocksource detection thingy. On
all systems where the tsc is good enough to use as a clocksource, we can
short-circuit.
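
To make the coupling concrete, here is a rough user-space sketch, not kernel code. Every name in it (struct clock_model, model_couple_stable_flag(), model_sched_clock_cpu()) is hypothetical, and the tsc_usable_as_clocksource field only stands in for whatever the clocksource detection actually decides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; none of these names are real kernel interfaces. */
struct clock_model {
	bool tsc_usable_as_clocksource;	/* assumed outcome of clocksource detection */
	bool stable;			/* models sched_clock_stable */
	uint64_t raw;			/* models the value sched_clock() would return */
	uint64_t filtered;		/* models the filtered per-cpu clock value */
};

/* Couple the stable flag to the (assumed) clocksource decision. */
static void model_couple_stable_flag(struct clock_model *m)
{
	assert(m);
	m->stable = m->tsc_usable_as_clocksource;
}

/* Short-circuit to the raw clock when it is trusted, else use the filtered one. */
static uint64_t model_sched_clock_cpu(const struct clock_model *m)
{
	assert(m);
	if (m->stable)
		return m->raw;
	return m->filtered;
}
```

With the flag derived this way, the fast path is taken exactly on those systems where the TSC was already judged good enough, and nothing changes elsewhere.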

> in the non-stable case we chicken out:
>
> 	/*
> 	 * Normally this is not called in NMI context - but if it is,
> 	 * trying to do any locking here is totally lethal.
> 	 */
> 	if (unlikely(in_nmi()))
> 		return scd->clock;
>
> as we'd have to take a spinlock, which isn't safe from NMI context.

Right, I've been looking at doing cpu_clock() differently, but since it's
all 64-bit we'd either need to introduce atomic64 into the code, or redo
it in the perf counter code.

So for now I've stuck with a plain sched_clock() timestamp.
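
For illustration only, the atomic64 route could look roughly like the user-space sketch below. It uses C11 atomics as a stand-in for a kernel atomic64 type, and struct clock_slot and clock_slot_update() are made-up names. The idea is a lock-free "advance to the newer timestamp" update, so nothing in the path takes a spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-cpu clock value held in a 64-bit atomic. */
struct clock_slot {
	_Atomic uint64_t clock;
};

/*
 * Move the stored clock forward to 'now' unless it is already at or
 * past that value. Returns the value the slot holds afterwards.
 * No lock is taken; a racing updater simply makes the CAS retry.
 */
static uint64_t clock_slot_update(struct clock_slot *s, uint64_t now)
{
	uint64_t old;

	assert(s);
	old = atomic_load_explicit(&s->clock, memory_order_relaxed);
	while (old < now) {
		if (atomic_compare_exchange_weak_explicit(&s->clock, &old, now,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return now;
		/* A failed CAS reloaded 'old'; the loop condition is rechecked. */
	}
	return old;
}
```

This only shows the monotonic update step; the real filtering in cpu_clock() does more than this, and whether a 64-bit atomic is cheap enough on 32-bit hardware is a separate question.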
