Re: [PATCH v7 3/3] KVM: perf: kvm events analysis tool

From: David Ahern
Date: Thu Sep 13 2012 - 10:14:32 EST

On 9/13/12 7:45 AM, Arnaldo Carvalho de Melo wrote:
On Wed, Sep 12, 2012 at 10:56:44PM -0600, David Ahern wrote:
static const char * const kvm_usage[] = {
+ "perf kvm [<options>] {top|record|report|diff|buildid-list|stat}",

The usage for the report/record sub commands of stat is never shown. e.g.,
$ perf kvm stat
--> shows help for perf-stat

$ perf kvm
--> shows the above and perf-kvm's usage

[I deleted this thread, so I am having to reply to one of my own
responses. Hopefully no one is unduly harmed by this.]

I've been using this command a bit lately -- especially on nested
virtualization -- and I think the syntax is quirky - meaning wrong.
In my case I always follow up a record with a report and end up
using a shell script wrapper that combines the 2 and running it
repeatedly. e.g.,

$PERF kvm stat record -o $FILE -p $pid -- sleep $time
[ $? -eq 0 ] && $PERF --no-pager kvm -i $FILE stat report

As my daughter likes to say - awkward.

That suggests what is really needed is a 'live' mode - a continual
updating of the output like perf top, not a record and analyze later
mode. Which does come back to why I responded to this email -- the
syntax is clunky and awkward.

So, I spent a fair amount of time today implementing a live mode.
And after a lot of swearing at the tracepoint processing code I

What kind of swearing? I'm working on 'perf test' entries for
tracepoints to make sure we don't regress on the perf/libtraceevent
junction, doing that as prep work for further simplifying tracepoint
tools like sched, kvm, kmem, etc.

Have you seen how the tracing initialization is done? Ugly. Record
generates tracing data events, and report uses those to do the init so
you can access the raw_data. I ended up writing this:

static int perf_kvm__tracing_init(void)
{
	struct tracing_data *tdata;
	char temp_file[] = "/tmp/perf-XXXXXXXX";
	int fd;

	fd = mkstemp(temp_file);
	if (fd < 0) {
		pr_err("mkstemp failed\n");
		return -1;
	}

	tdata = tracing_data_get(&kvm_events.evlist->entries, fd, false);
	if (!tdata)
		return -1;
	lseek(fd, 0, SEEK_SET);
	(void) trace_report(fd, &kvm_events.session->pevent, false);

	return 0;
}
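
For anyone not familiar with the trick in that function: it writes data
to a temp file, rewinds the descriptor, and reads the same data back
through the normal reader. Here is a standalone sketch of just that
round trip using plain POSIX calls. tmpfile_round_trip() is a
hypothetical helper made up for illustration, it is not perf code, and
the unlink() and close() are additions of this sketch rather than
something the function above does.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/*
 * Write len bytes of src to a temp file, rewind, and read them back
 * into dst. Returns the number of bytes read, or -1 on error.
 * Hypothetical helper for illustration only; not part of perf.
 */
static ssize_t tmpfile_round_trip(const void *src, size_t len, void *dst)
{
	char path[] = "/tmp/sketch-XXXXXX";
	ssize_t n = -1;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	/* Drop the name right away; the data lives until fd is closed. */
	unlink(path);

	if (write(fd, src, len) != (ssize_t)len)
		goto out;

	/* The file offset now sits at the end; rewind before reading. */
	if (lseek(fd, 0, SEEK_SET) < 0)
		goto out;

	n = read(fd, dst, len);
out:
	close(fd);
	return n;
}
```

The lseek() is the step that is easy to forget: without it the reader
starts at end-of-file and sees nothing.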

finally have it working. And the format extends easily (meaning less
than a day of work, and it is the next step) to a perf-based kvm_stat
replacement. Example syntax is:

perf kvm stat [-p <pid>|-a|...]

which defaults to an update delay of 1 second, and vmexit analysis.
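
To make the idea concrete, here is a rough sketch of what a live mode
has to do per event: remember the pending exit, charge the elapsed time
to its exit reason on the next entry, and dump and clear the table once
per update interval. This is not the actual implementation. Every name
here (live_stats, on_vm_exit, on_vm_entry, display_and_reset,
MAX_REASONS) is hypothetical, and it assumes a single vcpu and
nanosecond timestamps to stay short.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_REASONS 64	/* arbitrary bound for this sketch */

struct exit_stat {
	uint64_t count;
	uint64_t total_ns;
	uint64_t max_ns;
};

struct live_stats {
	struct exit_stat by_reason[MAX_REASONS];
	bool pending;		/* an exit is waiting for its entry */
	unsigned int reason;	/* reason of the pending exit */
	uint64_t exit_ts;	/* timestamp of the pending exit */
};

/* Record an exit; out-of-range reasons are dropped. */
static void on_vm_exit(struct live_stats *ls, unsigned int reason, uint64_t ts)
{
	ls->pending = reason < MAX_REASONS;
	ls->reason = reason;
	ls->exit_ts = ts;
}

/* Pair an entry with the pending exit and account the time spent. */
static void on_vm_entry(struct live_stats *ls, uint64_t ts)
{
	struct exit_stat *st;
	uint64_t delta;

	if (!ls->pending)
		return;
	ls->pending = false;
	if (ts < ls->exit_ts)
		return;

	delta = ts - ls->exit_ts;
	st = &ls->by_reason[ls->reason];
	st->count++;
	st->total_ns += delta;
	if (delta > st->max_ns)
		st->max_ns = delta;
}

/* Called once per update interval: print non-empty rows, then clear. */
static void display_and_reset(struct live_stats *ls)
{
	unsigned int i;

	for (i = 0; i < MAX_REASONS; i++) {
		const struct exit_stat *st = &ls->by_reason[i];

		if (!st->count)
			continue;
		printf("reason %2u: %llu exits, avg %llu ns, max %llu ns\n", i,
		       (unsigned long long)st->count,
		       (unsigned long long)(st->total_ns / st->count),
		       (unsigned long long)st->max_ns);
	}
	memset(ls->by_reason, 0, sizeof(ls->by_reason));
}
```

With the default delay mentioned above, the caller would feed events
into on_vm_exit()/on_vm_entry() as they arrive and call
display_and_reset() once a second.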

The guts of the processing logic come from the existing kvm-events
code. The changes focus on combining the record and report paths
into one. The display needs some help (Arnaldo?), but it seems to
work well.

I'd like to get opinions on what to do next. IMO, the record/report
path should not get a foothold, since that means backward compatibility
and having to maintain those options. I am willing to take the
existing patches into git to maintain authorship and from there
apply patches to make the live mode work - which includes a bit of
refactoring of perf code (like the stats changes).

Before I march down this path, any objections, opinions, etc?

Can I see the code?

Let me clean it up over the weekend and send out an RFC for it.
