Re: [PATCH -tip v5 2/5] perf: set correct value to perf_event_header.misc for BTS

From: Akihiro Nagai
Date: Thu Mar 15 2012 - 22:22:50 EST


(2012/03/07 2:32), Peter Zijlstra wrote:
On Tue, 2012-02-21 at 14:39 +0900, Akihiro Nagai wrote:
(2012/01/30 18:35), Peter Zijlstra wrote:
On Mon, 2012-01-30 at 13:43 +0900, Akihiro Nagai wrote:
@@ -330,6 +342,14 @@ int intel_pmu_drain_bts_buffer(void)
return 1;

for (; at < top; at++) {
+ /*
+ * To resolve user space symbols and DSOs correctly, set
+ * PERF_RECORD_MISC_USER if from_addr or to_addr is user space.
+ */
+ if (!kernel_ip(data.ip) || !kernel_ip(data.addr)) {
header.misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
+ header.misc |= PERF_RECORD_MISC_USER;
+ }
data.ip = at->from;
data.addr = at->to;

Why not key off of the from? If it's a jump from userspace it's a user
event, if it's a jump from kernel space it's a kernel event?

Of course, originally, perf does that.

I don't think it does.

And, in those cases,
on branches from kernel to user, BTS records both the kernel
address and the user address in one perf_sample.

Sorry, I don't get this.

Current perf sets PERF_RECORD_MISC_KERNEL to all BTS events,

It doesn't, it does something far more stupid.. it sets the state to
wherever we were when the BTS overflow interrupt happens. If that was in
kernel space, we mark all of them as in-kernel, if that was in
user-space we mark all of them in-user.

Normally, perf does. However, BTS (intel_pmu_drain_bts_buffer) uses a
locally defined, uninitialized pt_regs to decide between PERF_RECORD_MISC_KERNEL and USER.
Accordingly, almost all BTS events are marked PERF_RECORD_MISC_KERNEL.

BTW, I think using an uninitialized pt_regs is a bug.
So, it needs to be initialized:
- struct pt_regs regs;
+ struct pt_regs regs = {0};

Or, intel_pmu_drain_bts_buffer() could take pt_regs as a function argument:
- int intel_pmu_drain_bts_buffer(void)
+ int intel_pmu_drain_bts_buffer(struct pt_regs *regs)


and
perf-script doesn't resolve symbols and DSOs for the
user-space address, because it is a kernel event.

Well, that's a perf-script problem, not something you should change the
kernel for.

Exactly. I'm going to fix perf-script.


So how about something like this?

---
arch/x86/kernel/cpu/perf_event_intel_ds.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index d6bd49f..81e788c 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -333,6 +333,13 @@ int intel_pmu_drain_bts_buffer(void)
data.ip = at->from;
data.addr = at->to;

+ /* XXX doesn't do virt muck properly */
+ header.misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
+ if (kernel_ip(at->from))
+ header.misc |= PERF_RECORD_MISC_KERNEL;
+ else
+ header.misc |= PERF_RECORD_MISC_USER;
+
perf_output_sample(&handle, &header, &data, event);
}

It looks good. I think it is semantically correct to set
PERF_RECORD_MISC_KERNEL or USER based on the from-address.
However, this patch needs a perf-script fix too.
I'm going to include this patch in my next series.
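
For reference, the from-address rule can be tried out in user space with
a minimal sketch like the one below. It is only an illustration, not the
kernel code: bts_misc_from() is a hypothetical helper name, kernel_ip()
is reimplemented here under the assumption of x86-64 (kernel addresses
have the sign bit set), and the PERF_RECORD_MISC_* values are copied
from the perf ABI header.

```c
#include <stdbool.h>
#include <stdint.h>

/* cpumode bits of perf_event_header.misc, as in the perf ABI header */
#define PERF_RECORD_MISC_CPUMODE_MASK	(7 << 0)
#define PERF_RECORD_MISC_KERNEL		(1 << 0)
#define PERF_RECORD_MISC_USER		(2 << 0)

/* Assumption: x86-64 only, where kernel addresses are "negative". */
static bool kernel_ip(uint64_t ip)
{
	return (int64_t)ip < 0;
}

/*
 * Hypothetical helper: clear the old cpumode bits in misc, then set
 * KERNEL or USER depending on where the branch came from. All other
 * misc bits are left untouched.
 */
static uint16_t bts_misc_from(uint16_t misc, uint64_t from)
{
	misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
	if (kernel_ip(from))
		misc |= PERF_RECORD_MISC_KERNEL;
	else
		misc |= PERF_RECORD_MISC_USER;
	return misc;
}
```

With this, each BTS record is classified by its own from-address
(for example, calling bts_misc_from(header.misc, at->from) per record),
instead of all records inheriting the state at the time of the BTS
overflow interrupt. Like the patch above, it ignores virtualization.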

Thank you,