[RFC 0/2] Yet another take at user/kernel time correlation problem

From: Pawel Moll
Date: Fri Sep 12 2014 - 07:58:30 EST


Greetings,

Here comes yet another take at the problem of correlating perf
samples, timestamped in kernel (with - de-facto - sched_clock values),
with performance-related events (be it debug information from JIT
engines or energy sensor data obtained via USB or hwmon) generated
in user space.

The first patch adds an additional timestamp field in the perf
sample data, which can be requested for any perf event along
with normal PERF_SAMPLE_TIME. Events with both values appearing
periodically in the perf data allow user code to translate
raw monotonic time (obtained via POSIX clock API) to sched_clock
domain. Although any perf event can be used, the natural choice
would be a sched_switch trace event (for processes with root
permissions) or a hrtimer-based PERF_COUNT_SW_CPU_CLOCK.
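
To illustrate the translation step, here is a minimal user-space sketch.
It assumes the tool has already parsed the perf data into an array of
correlation points, each holding the two timestamps of one sample, sorted
by raw monotonic time with no duplicates. The struct, its field names and
the function name are made up for this example and are not part of the
patch; linear interpolation is just one plausible way of doing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One correlation point: the same instant seen in both clock domains.
 * Field names are illustrative only, not taken from the patch. */
struct clock_pair {
	uint64_t mono_raw_ns;	/* raw monotonic time carried by the sample */
	uint64_t perf_ns;	/* PERF_SAMPLE_TIME value of the same sample */
};

/*
 * Translate a raw monotonic timestamp into the perf time domain by linear
 * interpolation between two neighbouring correlation points. "pairs" must
 * be sorted by mono_raw_ns, contain no duplicates and hold at least two
 * entries. Timestamps outside the covered range are extrapolated from the
 * nearest segment.
 */
static uint64_t mono_raw_to_perf(const struct clock_pair *pairs, size_t n,
				 uint64_t mono_raw_ns)
{
	size_t i = 1;

	assert(n >= 2);

	/* Advance to the first segment that ends at or after the timestamp. */
	while (i < n - 1 && pairs[i].mono_raw_ns < mono_raw_ns)
		i++;

	const struct clock_pair *a = &pairs[i - 1];
	const struct clock_pair *b = &pairs[i];

	/* Rate of the perf clock relative to the raw monotonic clock. */
	double slope = (double)(b->perf_ns - a->perf_ns) /
		       (double)(b->mono_raw_ns - a->mono_raw_ns);

	/* Signed distance from the segment start; negative when extrapolating
	 * backwards. */
	double delta = (double)(int64_t)(mono_raw_ns - a->mono_raw_ns);

	/* Round the scaled offset to the nearest nanosecond. */
	double scaled = slope * delta;
	int64_t offset = (int64_t)(scaled + (scaled >= 0 ? 0.5 : -0.5));

	return (uint64_t)((int64_t)a->perf_ns + offset);
}
```

The user-space timestamp to translate would typically come from
clock_gettime(CLOCK_MONOTONIC_RAW, ...), converted to nanoseconds.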

One question I haven't found an answer to is: could it be even more
generic? As in: would it be possible to request a time value from
any of the available time sources? It doesn't make sense, I
believe, to have a PERF_SAMPLE_* for each possibility. Extra
flags in perf_event_attr maybe? (we still have 39 spare bits there)

The second patch, functionally orthogonal but complementing
the first one, replicates the "trace_marker" idea from ftrace
in the perf world. Instead of a debugfs file, there is an ioctl
command, which simply injects a new type of software event into
the buffer. The argument can point at a zero-terminated string
of PAGE_SIZE max length. If provided, it will be copied to
the "raw" part of a sample. Of course the event can sample
a monotonic clock as well, if used with the above, so one
gets a means of both synchronisation and time stamp approximation.

One doubt I have here is the ioctl argument. It takes
a string now, like trace_marker does. But maybe a simple
integer value would suffice? After all, the ioctl can only
be issued by the "owner" of the perf stream (unlike
in the trace_marker case, where "anyone" can write to it), so
we could rely on the owner to keep a dictionary of events of
some sort.
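
As a rough sketch of what such a dictionary could look like on the tool
side, assuming the marker carried only an integer: the identifiers, the
descriptions and the lookup helper below are all hypothetical and exist
only to illustrate the idea.

```c
#include <stddef.h>

/* Hypothetical marker identifiers, chosen by the tool owning the perf stream. */
enum marker_id {
	MARKER_JIT_COMPILE_START = 1,
	MARKER_JIT_COMPILE_END   = 2,
	MARKER_ENERGY_SAMPLE     = 3,
};

struct marker_desc {
	int id;
	const char *text;
};

/* The "dictionary": kept by the tool, never seen by the kernel. */
static const struct marker_desc marker_dict[] = {
	{ MARKER_JIT_COMPILE_START, "JIT compilation started" },
	{ MARKER_JIT_COMPILE_END,   "JIT compilation finished" },
	{ MARKER_ENERGY_SAMPLE,     "energy sensor sample taken" },
};

/* Map an integer found in a marker sample back to its description.
 * Returns NULL for identifiers the tool does not know about. */
static const char *marker_lookup(int id)
{
	size_t i;

	for (i = 0; i < sizeof(marker_dict) / sizeof(marker_dict[0]); i++)
		if (marker_dict[i].id == id)
			return marker_dict[i].text;

	return NULL;
}
```

The trade-off is that the perf data is then only meaningful together with
the tool's dictionary, whereas a string marker is self-describing.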

On another note, I proposed this subject for the tracing
microconference at Plumbers next month. Hope to have some good
discussion there. Maybe even a conclusion? (I wish... ;-)

Thanks!

Pawel


Pawel Moll (2):
  perf: Add sampling of the raw monotonic clock
  perf: Marker software event and ioctl

 include/linux/perf_event.h      |  2 ++
 include/uapi/linux/perf_event.h |  6 ++++-
 kernel/events/core.c            | 55 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 1 deletion(-)

--
1.9.1
