Re: [RFC] perf: need to expose sched_clock to correlate user samples with kernel samples

From: John Stultz
Date: Sat Feb 23 2013 - 01:04:52 EST


On 02/20/2013 02:29 AM, Peter Zijlstra wrote:
On Tue, 2013-02-19 at 10:25 -0800, John Stultz wrote:
So describe how the perf time domain is different than
CLOCK_MONOTONIC_RAW.
The primary difference is that the trace/sched/perf time domain is not
strictly monotonic, it is only locally monotonic -- that is, two time
stamps taken on the same cpu are guaranteed to be monotonic.

So how would a clock_gettime(CLOCK_PERF,...) interface help you figure out which cpu you got your timestamp from?


Furthermore, to make it useful, there's an actual bound on the inter-cpu
drift (implemented by limiting the drift to CLOCK_MONOTONIC).

So this sounds like you're already sort of interpolating to CLOCK_MONOTONIC, or am I just misunderstanding you?
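
To make the two properties above concrete, here is a toy model of a clock that is locally monotonic with bounded drift. It is only a sketch and not what kernel/sched/clock.c actually does: the CpuClock class, MAX_DRIFT_NS and the offsets are all made up for illustration, and the reference clock is assumed never to go backwards.

```python
from dataclasses import dataclass

MAX_DRIFT_NS = 1_000_000  # hypothetical bound on drift from the reference clock


@dataclass
class CpuClock:
    """Toy per-CPU clock: a skewed local counter clamped to a shared reference."""
    offset_ns: int      # hypothetical skew of this CPU's raw counter
    last_ns: int = 0    # last value handed out on this CPU

    def read(self, reference_ns: int) -> int:
        raw = reference_ns + self.offset_ns
        # Bound the drift: stay within MAX_DRIFT_NS of the reference.
        clamped = min(max(raw, reference_ns - MAX_DRIFT_NS),
                      reference_ns + MAX_DRIFT_NS)
        # Local monotonicity: never go backwards on this CPU.
        self.last_ns = max(self.last_ns, clamped)
        return self.last_ns


cpu0 = CpuClock(offset_ns=+5_000_000)
cpu1 = CpuClock(offset_ns=-5_000_000)

t_a = cpu0.read(reference_ns=10_000_000)  # taken first, on CPU 0
t_b = cpu1.read(reference_ns=10_500_000)  # taken later, on CPU 1
t_c = cpu0.read(reference_ns=10_600_000)  # taken last, on CPU 0 again
```

In this model t_b is smaller than t_a even though it was taken later, so ordering across CPUs is not preserved, while t_c is not smaller than t_a because both came from the same CPU. Any two readings taken at about the same reference time differ by at most 2 * MAX_DRIFT_NS.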

Additionally -- to increase use -- we also added a monotonic sync point
when cpu A queries time of cpu B.

Not sure I'm following this bit. But I'll have to go look at the code on Monday.


My concern here is that we're basically creating a kernel interface that
exports implementation-defined semantics (again: whatever perf does
right now). And I think folks want to do this, because adding CLOCK_PERF
is easier than trying to:

1) Get a lock-free method for accessing CLOCK_MONOTONIC_RAW

2) Having perf interpolate its timestamps to CLOCK_MONOTONIC, or
CLOCK_MONOTONIC_RAW when it exports the data
Mostly cheaper, not easier. Given unstable TSC, MONOTONIC will have to
fall back to another clock source (hpet, acpi_pm and other assorted
crap).

In order to avoid this, we'd have to relax the requirements. Using
anything other than TSC is simply not an option.

Right, and this I understand. We can play a little fast and loose with the rules for in-kernel uses, given the variety of hardware and the fact that performance is more critical than perfect accuracy. Since we're in-kernel we also have more information than userland does about what cpu we're running on, so we can get away with only locally-monotonic timestamps.

But I want to be careful, if we're exporting this out to userland, that it's both useful and that there's an actual specification for how CLOCK_PERF behaves, one that applications can rely upon not changing in the future.
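
For reference, option 2 above (interpolating perf timestamps to CLOCK_MONOTONIC at export time) could look roughly like the sketch below. It assumes the exporter has recorded (perf_ns, monotonic_ns) pairs at sync points, sorted by perf_ns with distinct values. Since perf timestamps are only locally monotonic, it would presumably need one such list per cpu. The perf_to_monotonic name and the sample numbers are hypothetical, not anything perf provides today.

```python
from bisect import bisect_right


def perf_to_monotonic(perf_ns, sync_points):
    """Map a perf timestamp onto the monotonic timeline by linear
    interpolation between the two nearest sync points.

    sync_points: list of (perf_ns, monotonic_ns) pairs, sorted by perf_ns,
    with at least two entries. Timestamps outside the recorded range are
    extrapolated from the nearest segment.
    """
    perf_keys = [p for p, _ in sync_points]
    i = bisect_right(perf_keys, perf_ns)
    # Clamp so that (i - 1, i) always names a valid segment.
    i = min(max(i, 1), len(sync_points) - 1)
    (p0, m0), (p1, m1) = sync_points[i - 1], sync_points[i]
    return m0 + (perf_ns - p0) * (m1 - m0) // (p1 - p0)


# Made-up sync points for a single cpu.
sync = [(1000, 5000), (2000, 6010), (3000, 7000)]
converted = [perf_to_monotonic(ts, sync) for ts in (1500, 2500, 3500)]
```

The accuracy of the result depends entirely on how often sync points are taken and on how far the two clocks drift between them, which is exactly the part that would need specifying.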

thanks
-john

