Re: [RFC] perf: need to expose sched_clock to correlate user samples with kernel samples

From: Peter Zijlstra
Date: Mon Feb 25 2013 - 09:11:26 EST


On Fri, 2013-02-22 at 22:04 -0800, John Stultz wrote:
> On 02/20/2013 02:29 AM, Peter Zijlstra wrote:
> > On Tue, 2013-02-19 at 10:25 -0800, John Stultz wrote:
> >> So describe how the perf time domain is different than
> >> CLOCK_MONOTONIC_RAW.
> > The primary difference is that the trace/sched/perf time domain is not
> > strictly monotonic; it is only locally monotonic -- that is, two
> > timestamps taken on the same cpu are guaranteed to be monotonic.
>
> So how would a clock_gettime(CLOCK_PERF,...) interface help you figure
> out which cpu you got your timestamp from?

I'm not sure we want to expose it that far. The reason people want
this clock exposed is to be able to do logging on the same time-line so
we can correlate events from both sources (kernel and user-space).

In the case of parallel execution we cannot guarantee order, so reading
logs and reconstructing events requires a bit of human intelligence.
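
As an illustration only, here is a minimal sketch of how a log reader might
decide whether two records can be ordered by timestamp alone. The struct,
the function name and the max_drift_ns parameter are all hypothetical; none
of them exist in the kernel or in perf. It assumes same-cpu stamps are
monotonic and that cross-cpu drift is bounded, which does not hold in
every mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical log record: which cpu produced it and its timestamp. */
struct event {
	int cpu;
	uint64_t ts_ns;
};

/*
 * Return true if the relative order of a and b can be read directly from
 * their timestamps. Same-cpu stamps are locally monotonic, so they always
 * qualify. Cross-cpu stamps qualify only when they are further apart than
 * the assumed drift bound; anything closer needs a human to look at it.
 */
static bool order_is_certain(const struct event *a, const struct event *b,
			     uint64_t max_drift_ns)
{
	assert(a && b);

	if (a->cpu == b->cpu)
		return true;

	uint64_t gap = a->ts_ns > b->ts_ns ? a->ts_ns - b->ts_ns
					   : b->ts_ns - a->ts_ns;
	return gap > max_drift_ns;
}
```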

> > Furthermore, to make it useful, there's an actual bound on the inter-cpu
> > drift (implemented by limiting the drift to CLOCK_MONOTONIC).
>
> So this sounds like you're already sort of interpolating to
> CLOCK_MONOTONIC, or am I just misunderstanding you?

That's right, although there are modes, where the TSC is guaranteed
stable, in which we don't do this (it avoids some expensive bits), so we
cannot rely on it.
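
To make the drift bound concrete, a simplified sketch of clamping a fast
per-cpu reading into a window around a slower reference clock follows. This
is not the kernel's code; the names and the MAX_DRIFT_NS value are
assumptions picked for illustration, and overflow of the 64-bit counter is
ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed drift bound, for illustration only. */
#define MAX_DRIFT_NS 1000000ULL

/* Hypothetical per-cpu state: the last value handed out on this cpu. */
struct local_clock {
	uint64_t last;
};

/*
 * Clamp the raw per-cpu reading into [lo, hi], where
 *   lo = max(reference, last value returned on this cpu)
 *   hi = reference + MAX_DRIFT_NS
 * so the result never goes backwards locally and never runs more than
 * MAX_DRIFT_NS ahead of the reference clock.
 */
static uint64_t clamp_to_reference(struct local_clock *lc, uint64_t raw,
				   uint64_t ref)
{
	assert(lc);

	uint64_t lo = ref > lc->last ? ref : lc->last;
	uint64_t hi = ref + MAX_DRIFT_NS;
	uint64_t t = raw;

	/* If the window collapsed, local monotonicity wins. */
	if (hi < lo)
		hi = lo;

	if (t < lo)
		t = lo;
	if (t > hi)
		t = hi;

	lc->last = t;
	return t;
}
```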

> > Additionally -- to increase use -- we also added a monotonic sync point
> > when cpu A queries time of cpu B.
>
> Not sure I'm following this bit. But I'll have to go look at the code
> on Monday.

It will basically pull the 'slowest' cpu forward so that for that
'event' we can say the two time-lines have a common point.
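
A rough single-threaded sketch of that sync point, assuming each cpu simply
carries its own current clock value. The struct and function names are
hypothetical, and a real implementation would need atomic updates since
both cpus may touch their clocks concurrently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-cpu clock value. */
struct cpu_clock {
	uint64_t now;
};

/*
 * Called when the cpu owning `self` asks for the time of the cpu owning
 * `remote`. Whichever clock is behind gets pulled forward to the other,
 * so both time-lines share one common point for this event.
 */
static uint64_t read_remote_clock(struct cpu_clock *self,
				  struct cpu_clock *remote)
{
	assert(self && remote);

	uint64_t common = self->now > remote->now ? self->now : remote->now;

	self->now = common;
	remote->now = common;
	return common;
}
```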

> Right, and this I understand. We can play a little fast and loose
> with the rules for in-kernel uses, given the variety of hardware and the
> fact that performance is more critical than perfect accuracy. Since
> we're in-kernel we also have more information than userland does about
> what cpu we're running on, so we can get away with only
> locally-monotonic timestamps.
>
> But I want to be careful, if we're exporting this out to userland, that
> it's both useful and that there's an actual specification for how
> CLOCK_PERF behaves that applications can rely upon not changing in the
> future.

Well, the timestamps themselves are already exposed to userspace
through the ftrace and perf data logs. All people want is to add a
secondary data stream on the same time-line.
