[PATCH] perf record: Allow poll timeout to be specified

From: David Ahern
Date: Tue Mar 24 2015 - 12:10:31 EST


Record currently wakes up based on watermarks to read events from the mmaps and
write them out to the file. The result is a file that can have large blocks of
events per mmap before a finished round event is added to the stream. This in
turn affects the quantity of events that have to be passed through the ordered
events queue before results can be displayed to the user. For commands like
perf-script this can lead to unnecessarily long delays before a user gets
output. Large systems (e.g., 1024 CPUs) further compound this effect. I have seen
instances where I have to wait 45 minutes for perf-script to process a 5GB file
before any events are shown.

This patch adds an option to perf-record to allow a user to specify the
poll timeout in msec. For example, using a 100 msec timeout, similar to
perf-top, means the mmaps are traversed much more frequently, which makes the
analysis side smoother.
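
As an illustration only, the difference between an infinite and a finite
timeout can be sketched with plain poll(2). The helper name wait_for_events()
is hypothetical and is not part of perf; it merely assumes that the timeout
handed to perf_evlist__poll() follows the usual poll(2) convention, where -1
blocks until an event arrives and a positive value returns after at most that
many milliseconds.

```c
#include <assert.h>
#include <poll.h>

/*
 * Hypothetical helper, not part of perf: wait until fd becomes readable
 * or until timeout_ms milliseconds have elapsed.
 *
 * timeout_ms == -1 blocks indefinitely (the default this patch keeps).
 * A positive value such as 100 returns after at most that long, which
 * gives the caller a regular chance to walk its buffers.
 *
 * Returns >0 if fd is readable, 0 on timeout, -1 on error (errno set).
 */
static int wait_for_events(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

	assert(fd >= 0);
	return poll(&pfd, 1, timeout_ms);
}
```

With a finite timeout, a return value of 0 is not an error. It only means that
nothing became readable in time, so the caller can go around its loop again
and check the buffers.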

Signed-off-by: David Ahern <david.ahern@xxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
Cc: Stephane Eranian <eranian@xxxxxxxxxx>
Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
---
tools/perf/Documentation/perf-record.txt | 6 ++++++
tools/perf/builtin-record.c | 5 ++++-
tools/perf/perf.h | 1 +
3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 355c4f5569b5..7010c363fdd1 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -250,6 +250,12 @@ is off by default.
--running-time::
Record running and enabled time for read events (:S)

+--poll=::
+Polling interval in msec. Defaults to infinite, which means record relies on
+watermarks to wake up and read events from each mmap. Setting poll helps smooth
+the event collection across mmaps and the subsequent processing of the data
+file. For example, perf-top uses a 100 msec polling interval.
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5a2ff510b75b..091868288d29 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -485,7 +485,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
if (hits == rec->samples) {
if (done || draining)
break;
- err = perf_evlist__poll(rec->evlist, -1);
+ err = perf_evlist__poll(rec->evlist, opts->poll_timeout);
/*
* Propagate error, only if there's any. Ignore positive
* number of returned events and interrupt error.
@@ -734,6 +734,7 @@ static struct record record = {
.user_freq = UINT_MAX,
.user_interval = ULLONG_MAX,
.freq = 4000,
+ .poll_timeout = -1,
.target = {
.uses_mmap = true,
.default_per_cpu = true,
@@ -841,6 +842,8 @@ struct option __record_options[] = {
"Sample machine registers on interrupt"),
OPT_BOOLEAN(0, "running-time", &record.opts.running_time,
"Record running/enabled time of read (:S) events"),
+ OPT_INTEGER(0, "poll", &record.opts.poll_timeout,
+ "poll interval in ms (defaults to infinite)"),
OPT_END()
};

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 1caa70a4a9e1..ee847c8af668 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -62,6 +62,7 @@ struct record_opts {
u64 user_interval;
bool sample_transaction;
unsigned initial_delay;
+ int poll_timeout;
};

struct option;
--
2.2.1
