Re: Intel PT not work on 4.5.0-rc

From: Arnaldo Carvalho de Melo
Date: Tue Mar 01 2016 - 10:59:32 EST


Em Tue, Mar 01, 2016 at 05:40:49PM +0200, Adrian Hunter escreveu:
> On 25/02/16 23:59, Adrian Hunter wrote:
> > On 25/02/2016 4:21 p.m., Arnaldo Carvalho de Melo wrote:
> >> Hi Adrian,
> >>
> >> While fixing a problem I introducing in parsing 'intel_pt//',
> >> i.e. the empty '//' part, which is fixed in my perf/core by now, I tried
> >> using intel_pt with 4.5.0-rc4 and couldn't get any trace :-\
> >>
> >> Talking with Jiri he tried it on his side, also a Broadwell
> >> machine, and it worked, same tooling (acme/perf/core) but with a fedora
> >> kernel, 4.3.5-200.fc22.x86_64.
> >>
> >> Later I'll try switching to the kernel Jiri is using to see if it works
> >> for me with that one.
> >>
> >> I'm busy now and this Broadwell machine is my main workstation,
> >> can you check if you reproduce this problem?
> >
> > I am on vacation until some time next week. I will try to look at
> > it before then but might not manage it.
>
> Checked it now and was not able to reproduce.
>
> Was this happening every time, or did it just happen once?
>
> One possible explanation is that the AUX area mmaps fail or don't
> happen and that is somehow not reported as an error.
>
> One way to check that is:
>
> # strace perf record -e intel_pt//u -a sleep 0.001 2>&1 | grep mmap | grep MAP_SHARED
> mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f6627239000
> mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0x81000) = 0x7f6623927000
> mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0x7f66271b8000
> mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0x81000) = 0x7f6623527000
> mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0) = 0x7f6627137000
> mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0x81000) = 0x7f6623127000
> mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_SHARED, 7, 0) = 0x7f66230a6000
> mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED, 7, 0x81000) = 0x7f6622ca6000
>
> That shows the normal events buffer (512KiB + 1 page) and the
> AUX area buffer (4MiB) for each of 4 cpus.

[root@jouet linux]# perf trace --no-inherit -e mmap perf record -e intel_pt//u -a sleep 0.001 2>&1 | grep SHARED
25.797 ( 0.067 ms): mmap(len: 528384, prot: READ|WRITE, flags: SHARED, fd: 3<anon_inode:[perf_event]>) = 0x7f0250418000
26.168 ( 0.367 ms): mmap(len: 4194304, prot: READ|WRITE, flags: SHARED, fd: 3<anon_inode:[perf_event]>, off: 528384) = 0x7f024c47b000
26.264 ( 0.076 ms): mmap(len: 528384, prot: READ|WRITE, flags: SHARED, fd: 4<anon_inode:[perf_event]>) = 0x7f0250397000
26.631 ( 0.364 ms): mmap(len: 4194304, prot: READ|WRITE, flags: SHARED, fd: 4<anon_inode:[perf_event]>, off: 528384) = 0x7f024c07b000
26.723 ( 0.081 ms): mmap(len: 528384, prot: READ|WRITE, flags: SHARED, fd: 5<anon_inode:[perf_event]>) = 0x7f0250316000
27.094 ( 0.365 ms): mmap(len: 4194304, prot: READ|WRITE, flags: SHARED, fd: 5<anon_inode:[perf_event]>, off: 528384) = 0x7f024bc7b000
27.203 ( 0.099 ms): mmap(len: 528384, prot: READ|WRITE, flags: SHARED, fd: 7<anon_inode:[perf_event]>) = 0x7f024bbfa000
Binary file (standard input) matches
[root@jouet linux]#

It somehow misses one mmap :-/ Will check where is the problem, haven't see any
THROTTLE or LOST events :-\

Anyway, 'strace' shows the 4 pairs of mmaps, but it still doesn't work, rebooting:

[root@jouet linux]# strace -e mmap perf record -e intel_pt//u -a sleep 0.001 2>&1 | grep SHARED
mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7fc3f8818000
mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0x81000) = 0x7fc3f487b000
mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0x7fc3f8797000
mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0x81000) = 0x7fc3f447b000
mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0) = 0x7fc3f8716000
mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0x81000) = 0x7fc3f407b000
mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_SHARED, 7, 0) = 0x7fc3f3ffa000
mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED, 7, 0x81000) = 0x7fc3f3bfa000
Binary file (standard input) matches
[root@jouet linux]# perf script
[root@jouet linux]#

>
> >
> >>
> >> [root@jouet ~]# perf --version
> >> perf version 4.5.rc4.gbb109a
> >> [root@jouet ~]# perf record -e intel_pt//u -a sleep 10
> >> [ perf record: Woken up 1 times to write data ]
> >> [ perf record: Captured and wrote 1.767 MB perf.data ]
> >> [root@jouet ~]# perf evlist -v
> >> intel_pt//u: type: 7, size: 112, config: 0x400, { sample_period,
> >> sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER, read_format:
> >> ID, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1,
> >> sample_id_all: 1
> >> dummy:u: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }:
> >> 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER, read_format: ID, disabled: 1,
> >> inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1,
> >> sample_id_all: 1, mmap2: 1, comm_exec: 1, context_switch: 1
> >> [root@jouet ~]# perf script
> >> [root@jouet ~]#
> >>
> >> From 'perf report -D'
> >>
> >> 0x1a8 [0x88]: event: 70
> >> .
> >> . ... raw event: size 136 bytes
> >> . 0000: 46 00 00 00 00 00 88 00 01 00 00 00 00 00 00 00 F...............
> >> . 0010: 07 00 00 00 00 00 00 00 1f 00 00 00 00 00 00 00 ................
> >> . 0020: 14 01 58 31 00 00 00 00 69 84 20 d9 d5 3b 00 00 ..X1....i. ..;..
> >> . 0030: 01 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 ................
> >> . 0040: 00 08 00 00 00 00 00 00 03 00 00 00 00 00 00 00 ................
> >> . 0050: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
> >> . 0060: 00 02 00 00 00 00 00 00 00 c0 03 00 00 00 00 00 ................
> >> . 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> >> . 0080: 02 00 00 00 00 00 00 00 ........
> >>
> >> 0x1a8 [0x88]: PERF_RECORD_AUXTRACE_INFO type: 1
> >> PMU Type 7
> >> Time Shift 31
> >> Time Muliplier 827851028
> >> Time Zero 65789656859753
> >> Cap Time Zero 1
> >> TSC bit 0x400
> >> NoRETComp bit 0x800
> >> Have sched_switch 3
> >> Snapshot mode 0
> >> Per-cpu maps 1
> >> MTC bit 0x200
> >> TSC:CTC numerator 0
> >> TSC:CTC denominator 0
> >> CYC bit 0x2
> >>
> >> [root@jouet ~]# perf report -D | tail -13
> >> Aggregated stats: (excludes AUX area (e.g. instruction trace) decoded /
> >> synthesized events)
> >> TOTAL events: 21753
> >> MMAP events: 130
> >> COMM events: 596
> >> EXIT events: 2
> >> FORK events: 594
> >> MMAP2 events: 8541
> >
> > There ought to be AUX events here, so it looks like no data has been captured,
> > so a problem with recording.
> >
> >> ITRACE_START events: 4
> >> SWITCH_CPU_WIDE events: 11882
> >> FINISHED_ROUND events: 3
> >> AUXTRACE_INFO events: 1
> >> intel_pt//u stats:
> >> dummy:u stats:
> >> [root@jouet ~]#
> >>
> >