Re: Shift by one instruction in the perf annotate output

From: Ingo Molnar
Date: Fri Jan 27 2012 - 05:27:57 EST



* Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:

> > I am running Linux and perf 3.2 but I remember that previous
> > versions suffered from the same issue.
> >
> > I don't know if it could be specific to my cpu:
> > processor : 0
> > vendor_id : GenuineIntel
> > cpu family : 6
> > model : 15
> > model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
>
> And sadly it's the best you'll get on your machine, most Intel
> chips after that (including the core2 shrink, but excluding
> the latest core i7 SNB) can do better using a feature called
> PEBS.

Which can be activated on those CPUs using the '-e cycles:pp'
option (the first 'p' stands for 'precise', the second 'p' for
'very precise' ;-).
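
As a rough illustration of what the modifier means: each 'p' raises the
precision level requested from the kernel, which to my understanding ends
up in the precise_ip field of perf_event_attr. The helper below is a
hypothetical sketch, not perf's actual parser; it only handles the simple
'name:modifiers' form and ignores tracepoint-style names that contain a
colon of their own.

```python
# Meaning of each precise_ip level, as described in the perf_event_open
# documentation (paraphrased).
PRECISE_IP_MEANING = {
    0: "SAMPLE_IP can have arbitrary skid",
    1: "SAMPLE_IP must have constant skid",
    2: "SAMPLE_IP requested to have zero skid",
    3: "SAMPLE_IP must have zero skid",
}


def precise_level(event: str) -> int:
    """Count the 'p' modifiers in a simple event spec such as 'cycles:pp'.

    Hypothetical helper: assumes the spec is 'name' or 'name:modifiers'.
    """
    _name, sep, modifiers = event.partition(":")
    level = modifiers.count("p") if sep else 0
    if level not in PRECISE_IP_MEANING:
        raise ValueError(f"unsupported precision level {level} in {event!r}")
    return level


if __name__ == "__main__":
    for spec in ("cycles", "cycles:p", "cycles:pp"):
        level = precise_level(spec)
        print(f"{spec:10s} -> precise_ip={level}: {PRECISE_IP_MEANING[level]}")
```

So 'cycles:pp' asks for level 2, which is the level at which the extra
IP fixup described next comes into play.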

In that case some rather non-obvious perf magic is activated (we
use PEBS for precise samples and use the LBR hardware to rewind
the IP), due to which annotation output looks like this:

         :      ffffffff810a6f51 <do_raw_spin_lock>:
    1.77 :      ffffffff810a6f51:       mov    $0x10000,%eax
   44.95 :      ffffffff810a6f56:       lock xadd %eax,(%rdi)
    1.25 :      ffffffff810a6f5a:       mov    %eax,%edx
    0.29 :      ffffffff810a6f5c:       shr    $0x10,%edx
    1.21 :      ffffffff810a6f5f:       cmp    %dx,%ax
    0.01 :      ffffffff810a6f62:       je     ffffffff810a6f6b <do_raw_spin_lock+0x1a>
   29.81 :      ffffffff810a6f64:       pause
   16.45 :      ffffffff810a6f66:       mov    (%rdi),%ax
    4.27 :      ffffffff810a6f69:       jmp    ffffffff810a6f5f <do_raw_spin_lock+0xe>
    0.00 :      ffffffff810a6f6b:       retq

the entries are both precise and show up in the right place.
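
To make the rewinding idea concrete, here is a minimal sketch of it in
Python. It assumes (as I understand the hardware of this era) that PEBS
reports the address of the instruction following the one that triggered
the event, and that the LBR gives us a known branch target to start
decoding from. The function name and the hard-coded instruction lengths
are illustrative only; the lengths are simply the differences between
the addresses in the listing above, and real code needs an instruction
decoder.

```python
def rewind_ip(block_start, insn_lengths, reported_ip):
    """Return the address of the instruction just before reported_ip.

    block_start:  address where decoding starts (e.g. a branch target)
    insn_lengths: byte length of each instruction from block_start on
    reported_ip:  the "one instruction late" address in the sample

    Returns None if reported_ip is the block start itself or is not an
    instruction boundary within the given lengths.
    """
    addr = block_start
    prev = None
    for length in insn_lengths:
        if addr == reported_ip:
            return prev
        prev = addr
        addr += length
    # reported_ip may sit right after the last decoded instruction
    return prev if addr == reported_ip else None


if __name__ == "__main__":
    start = 0xFFFFFFFF810A6F51           # do_raw_spin_lock entry
    lengths = [5, 4, 2, 3, 3, 2]         # mov, lock xadd, mov, shr, cmp, je
    late_ip = 0xFFFFFFFF810A6F5A         # address right after 'lock xadd'
    print(hex(rewind_ip(start, lengths, late_ip)))
```

With those inputs the sample is moved from the 'mov %eax,%edx' at ...6f5a
back to the 'lock xadd' at ...6f56, which is where the cost belongs.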

On Core2 CPUs there's PEBS so 'p' will work, but there's no LBR
so the IP-rewinding does not work.

Thanks,

Ingo