Re: [RFC 1/4] perf kvm: Enable 'record' on powerpc

From: Ravi Bangoria
Date: Mon May 09 2016 - 10:58:34 EST


Hi Arnaldo,

Thanks for the review. Please find my comments below.

On Thursday 28 April 2016 03:17 AM, Arnaldo Carvalho de Melo wrote:
Em Wed, Apr 27, 2016 at 06:02:21PM +0530, Ravi Bangoria escreveu:
Hi Arnaldo,

I've worked on your patch. I'm sending this patch(diff) to check if this
is the same idea you want to progress with. I cleanup your patch,
removed arch specific compile time directives and changed code to
enable cross arch reporting. I tested record on powerpc and report
on x86 and it's working.

Please give suggestion about your approach. Let me know if you have
some other idea to progress with.

Here is the diff w.r.t perf/cpumode branch:

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index bff6664..83ef6c6 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1480,6 +1480,60 @@ perf_stat:
}
#endif /* HAVE_KVM_STAT_SUPPORT */

+#define PPC_HV_DECREMENTER 2432
+#define PPC_HV_BIT 3
+#define PPC_PR_BIT 49
+#define PPC_MAX 63
+
+static bool perf_sample__is_hv_dec_trap(struct perf_sample *sample, struct
perf_evsel *evsel)
+{
+ int trap = perf_evsel__intval(evsel, sample, "trap");
+ return trap == PPC_HV_DECREMENTER;
+}
+
+static void perf_kvm__munge_ppc_guest_sample(struct perf_evsel *evsel,
struct perf_sample *sample)
+{
+ unsigned long msr, hv, pr;
+
+ if (!perf_sample__is_hv_dec_trap(sample, evsel))
+ return;
+
+ sample->ip = perf_evsel__intval(evsel, sample, "pc");
+ sample->cpumode = PERF_RECORD_MISC_GUEST_KERNEL;
+
+ msr = perf_evsel__intval(evsel, sample, "msr");
+ hv = msr & ((unsigned long)1 << (PPC_MAX - PPC_HV_BIT));
+ pr = msr & ((unsigned long)1 << (PPC_MAX - PPC_PR_BIT));
+ if (!hv && pr)
+ sample->cpumode = PERF_RECORD_MISC_GUEST_USER;
+}
+
+static bool perf_evlist__recorded_on_ppc(const struct perf_evlist *evlist)
+{
+ if (evlist->env && evlist->env->arch) {
+ return !strcmp(evlist->env->arch, "ppc64") ||
+ !strcmp(evlist->env->arch, "ppc64le");
+ }
+ return false;
+}
+
+int perf_kvm__setup_munge_ppc_guest_event(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel;
+ const char name[] = "kvm_hv:kvm_guest_exit";
+
+ if (!perf_evlist__recorded_on_ppc(evlist))
+ return 0;
+
+ evsel = perf_evlist__find_tracepoint_by_name(evlist, name);
+ if (evsel == NULL)
+ return -1;
+
+ evsel->munge_sample = perf_kvm__munge_ppc_guest_sample;
+
+ return 0;
+}
+
static int __cmd_record(const char *file_name, int argc, const char **argv)
{
int rec_argc, i = 0, j;
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ab47273..7cb41f7 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -879,6 +879,12 @@ repeat:
if (session == NULL)
return -1;

+ if (perf_guest &&
+ perf_kvm__setup_munge_ppc_guest_event(session->evlist)) {
+ pr_err("PPC event not present in %s file\n", input_name);
+ goto error;
+ }
This looks out of place, i.e. this reads: "For all cases where there is
a guest and we can't setup the ppc KVM guest related stuff, its an
error"

I think this gets clearer as:

if (perf_guest && perf_evlist__recorded_on_ppc(evlist) &&
perf_kvm__setup_munge_ppc_guest_event(session->evlist)) {
pr_err("PPC event not present in %s file\n", input_name);
goto error;
}

Then we read this as "Hey, if this was recorded on ppc, try to set
things up for ppc",

Yes I'll change this.

but then again, what is this KVM stuff doing in the
generic 'perf report' code?

Basically we are checking if data recorded on ppc in perf.data file. Which
can be done after opening a file and mapping header info in evlist. And
evlist is initialized in builtin-record.c only. So, I don't see any possibility to
move this in builtin-kvm.c. Kindly guide how can we do it.

What if this is a perf.data file generated on PPC but being read on PPC?
This will not make sense to munge it, right?

If you are asking about normal(without kvm) perf record and report, it's
working with this patch. Otherwise can you please explain little bit more.

But yes, we can change this code like this:

if (perf_guest && perf_evlist__recorded_on_ppc(session->evlist))
perf_kvm__setup_munge_ppc_guest_event(session->evlist);

and change definition of perf_kvm__setup_munge_ppc_guest_event as:

void perf_kvm__setup_munge_ppc_guest_event(struct perf_evlist *evlist)
{
struct perf_evsel *evsel;
const char name[] = "kvm_hv:kvm_guest_exit";

evsel = perf_evlist__find_tracepoint_by_name(evlist, name);
if (evsel == NULL)
return;

evsel->munge_sample = perf_kvm__munge_ppc_guest_sample;
}


This is with what I remember from this case, please bear with me.

Regards,
Ravi