Re: [PATCH 5/6] perf_counter: add more context information

From: Peter Zijlstra
Date: Mon Apr 06 2009 - 06:59:51 EST


On Fri, 2009-04-03 at 11:25 -0700, Corey Ashford wrote:
> Peter Zijlstra wrote:
> > On Thu, 2009-04-02 at 11:12 +0200, Peter Zijlstra wrote:
> >> plain text document attachment (perf_counter_callchain_context.patch)
> >> Put in counts to tell which ips belong to what context.
> >>
> >>   -----
> >>    | |  hv
> >>    | --
> >> nr | |  kernel
> >>    | --
> >>    | |  user
> >>   -----
> >
> > Right, just realized that PERF_RECORD_IP needs something similar if one
> > is not able to derive the context from the IP itself.
> >
> Three individual bits would suffice, or you could use a two-bit code -
> 00 = user
> 01 = kernel
> 10 = hypervisor
> 11 = reserved (or perhaps unknown)
>
> Unfortunately, because of alignment, it would need to take up another 64
> bit word, wouldn't it? Too bad you cannot sneak the bits into the IP in
> a machine independent way.
>
> And since you probably need a separate word, that effectively doubles
> the amount of space taken up by IP samples (if we add a "no event
> header" option). Should we add another bit in the record_type field -
> PERF_RECORD_IP_LEVEL (or similar) so that user-space apps don't have to
> get this if they don't need it?
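
To put a number on the alignment cost described above, here is a minimal
sketch. The struct and helper names are made up for illustration and are
not part of the patch. The 16-byte figure assumes an ABI where a 64-bit
integer is 8-byte aligned (x86-64, for example).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample layouts, for illustration only. */
struct ip_sample {
	uint64_t ip;			/* bare instruction pointer */
};

struct ip_sample_with_level {
	uint64_t ip;
	uint8_t  level;			/* only 2 bits are needed: user/kernel/hv */
};

/* Round a payload size up to the next multiple of align. */
static size_t padded_size(size_t payload, size_t align)
{
	return (payload + align - 1) / align * align;
}
```

With 8-byte alignment, padded_size(8 + 1, 8) is 16, which is also what
sizeof(struct ip_sample_with_level) gives on such an ABI: one extra byte
of information costs a whole extra word per sample.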

If we limit the event size to 64k (surely enough, right? :-), then we
have 16 more bits to play with in the header, and we could do something
like the below.

A further possibility would also be to add an overflow bit in there,
making the full 32bit PERF_RECORD space available to output events as
well.
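
As a rough userspace check of the arithmetic: halving the size field
frees 16 bits, 2 of which carry the level and 14 of which stay reserved,
so the header should remain 64 bits wide. The sketch below mirrors the
proposed layout with stdint types standing in for __u32/__u16; the names
event_header_sketch and event_fits are hypothetical, and 16-bit
bit-fields rely on compiler behaviour (GCC and Clang accept them).

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the proposed header layout (illustration only). */
struct event_header_sketch {
	uint32_t type;
	uint16_t level    : 2,		/* hv / kernel / user */
		 reserved : 14;		/* spare bits */
	uint16_t size;			/* total record size in bytes */
};

/* The header should still occupy exactly one 64-bit word. */
_Static_assert(sizeof(struct event_header_sketch) == 8,
	       "header is expected to stay 8 bytes");

/* A record is representable only if its size fits in the 16-bit field. */
static int event_fits(size_t bytes)
{
	return bytes <= UINT16_MAX;
}
```

Note that a 16-bit size field tops out at 65535 bytes, so the limit is
one byte short of a full 64k.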

Index: linux-2.6/include/linux/perf_counter.h
===================================================================
--- linux-2.6.orig/include/linux/perf_counter.h
+++ linux-2.6/include/linux/perf_counter.h
@@ -201,9 +201,17 @@ struct perf_counter_mmap_page {
 	__u32   data_head;		/* head in the data section */
 };

+enum {
+	PERF_EVENT_LEVEL_HV	= 0,
+	PERF_EVENT_LEVEL_KERNEL	= 1,
+	PERF_EVENT_LEVEL_USER	= 2,
+};
+
 struct perf_event_header {
 	__u32	type;
-	__u32	size;
+	__u16	level		: 2,
+		__reserved	: 14;
+	__u16	size;
 };

enum perf_event_type {
Index: linux-2.6/kernel/perf_counter.c
===================================================================
--- linux-2.6.orig/kernel/perf_counter.c
+++ linux-2.6/kernel/perf_counter.c
@@ -1832,6 +1832,8 @@ static void perf_counter_output(struct p

 	header.type = PERF_EVENT_COUNTER_OVERFLOW;
 	header.size = sizeof(header);
+	header.level = user_mode(regs) ?
+		PERF_EVENT_LEVEL_USER : PERF_EVENT_LEVEL_KERNEL;

 	if (record_type & PERF_RECORD_IP) {
 		ip = instruction_pointer(regs);
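
On the consuming side, a reader could then classify records without
looking at the IP at all. A minimal sketch, assuming records sit back to
back in a flat buffer and each starts with the header proposed above;
count_levels and the LEVEL_* names are hypothetical, and the real mmap
ring buffer (wrap-around, data_head) is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { LEVEL_HV = 0, LEVEL_KERNEL = 1, LEVEL_USER = 2, LEVEL_MAX = 4 };

/* Userspace mirror of the proposed header layout (illustration only). */
struct event_header_sketch {
	uint32_t type;
	uint16_t level    : 2,
		 reserved : 14;
	uint16_t size;
};

/*
 * Walk the records in buf and bump counts[level] for each one.
 * The caller zeroes counts beforehand. Returns the number of records
 * seen, or -1 if a record claims a size smaller than the header or
 * larger than the remaining buffer.
 */
static int count_levels(const unsigned char *buf, size_t len,
			unsigned int counts[LEVEL_MAX])
{
	size_t off = 0;
	int n = 0;

	while (off + sizeof(struct event_header_sketch) <= len) {
		struct event_header_sketch h;

		memcpy(&h, buf + off, sizeof(h));	/* avoid unaligned access */
		if (h.size < sizeof(h) || h.size > len - off)
			return -1;
		counts[h.level]++;
		off += h.size;
		n++;
	}
	return n;
}
```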

