Re: [PATCH v2]: perf/core: addressing 4x slowdown during per-process, profiling of STREAM benchmark on Intel Xeon Phi

From: Alexey Budankov
Date: Wed Jun 14 2017 - 08:26:20 EST


On 31.05.2017 3:04, Arun Kalyanasundaram wrote:
> Hi Alexey,
>
> I am interested in validating this fix. Can you please share some of
> your testcases or let me know if you use any standard OpenMP
> benchmarks?
>
> - Arun


Hi Arun,

I am profiling the STREAM benchmark running with 272 OpenMP threads. The testcase looks like this:

#!/bin/bash
echo 0 > /proc/sys/kernel/watchdog
echo 1 > /proc/sys/kernel/perf_event_paranoid
/usr/bin/time /usr/bin/perf record -N -B -T -R -d -e cpu/period=0x155cc0,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x3c,in_tx=0x0,ldlat=0x0,umask=0x0,in_tx_cp=0x0,offcore_rsp=0x0/Duk,cpu/period=0x155cc0,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x0,in_tx=0x0,ldlat=0x0,umask=0x3,in_tx_cp=0x0,offcore_rsp=0x0/Duk,cpu/period=0x155cc0,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xc0,in_tx=0x0,ldlat=0x0,umask=0x0,in_tx_cp=0x0,offcore_rsp=0x0/Duk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x3,in_tx=0x0,ldlat=0x0,umask=0x8,in_tx_cp=0x0,offcore_rsp=0x0/ukpp,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x3,in_tx=0x0,ldlat=0x0,umask=0x1,in_tx_cp=0x0,offcore_rsp=0x0/ukpp,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x4,in_tx=0x0,ldlat=0x0,umask=0x2,in_tx_cp=0x0,offcore_rsp=0x0/ukpp,cpu/period=0x186a7,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x4,in_tx=0x0,ldlat=0x0,umask=0x4,in_tx_cp=0x0,offcore_rsp=0x0/ukpp,cpu/period=0x1e8483,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x3c,in_tx=0x0,ldlat=0x0,umask=0x0,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x1e8483,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xc2,in_tx=0x0,ldlat=0x0,umask=0x10,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xca,in_tx=0x0,ldlat=0x0,umask=0x4,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xca,in_tx=0x0,ldlat=0x0,umask=0x90,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x1e8483,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xc2,in_tx=0x0,ldlat=0x0,umask=0x1,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xc3,in_tx=0x0,ldlat=0x0,umask=0x4,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x4,in_tx=0x0,ldlat=0x0,umask=0x20,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x5,in_tx=0x0,ldlat=0x0,umask=0x3,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x1e8483,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xcd,in_tx=0x0,ldlat=0x0,umask=0x1,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x3,in_tx=0x0,ldlat=0x0,umask=0x4,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x86,in_tx=0x0,ldlat=0x0,umask=0x4,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x4,in_tx=0x0,ldlat=0x0,umask=0x10,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x4,in_tx=0x0,ldlat=0x0,umask=0x40,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x4,in_tx=0x0,ldlat=0x0,umask=0x80,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xc2,in_tx=0x0,ldlat=0x0,umask=0x40,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xc2,in_tx=0x0,ldlat=0x0,umask=0x20,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x5,in_tx=0x0,ldlat=0x0,umask=0x2,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xe6,in_tx=0x0,ldlat=0x0,umask=0x1,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xe7,in_tx=0x0,ldlat=0x0,umask=0x1,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xc3,in_tx=0x0,ldlat=0x0,umask=0x1,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0xc3,in_tx=0x0,ldlat=0x0,umask=0x2,in_tx_cp=0x0,offcore_rsp=0x0/uk,cpu/period=0x30d43,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=0x4,in_tx=0x0,ldlat=0x0,umask=0x1,in_tx_cp=0x0,offcore_rsp=0x0/uk -- ./stream
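
Every event in the list above uses the same template; only period, event, umask and the trailing modifiers differ. A more readable way to write the same list might be a small table plus a helper, roughly as sketched here. This is only a sketch: mk_event is a hypothetical helper (not part of perf), the table was transcribed by hand from the command above, so it is worth diffing the output against the original before relying on it, and the perf invocation is left commented out.

```shell
#!/bin/bash
# Sketch: build the -e argument from a compact table.
# mk_event is a hypothetical helper, not a perf feature.
mk_event() {
    # $1 = period, $2 = event, $3 = umask, $4 = modifiers
    printf 'cpu/period=%s,pc=0x0,any=0x0,inv=0x0,edge=0x0,cmask=0x0,event=%s,in_tx=0x0,ldlat=0x0,umask=%s,in_tx_cp=0x0,offcore_rsp=0x0/%s' "$1" "$2" "$3" "$4"
}

events=""
# Columns: period event umask modifiers (hand-transcribed from the command above)
while read -r period event umask mods; do
    events="${events:+$events,}$(mk_event "$period" "$event" "$umask" "$mods")"
done <<'EOF'
0x155cc0 0x3c 0x0 Duk
0x155cc0 0x0 0x3 Duk
0x155cc0 0xc0 0x0 Duk
0x30d43 0x3 0x8 ukpp
0x30d43 0x3 0x1 ukpp
0x30d43 0x4 0x2 ukpp
0x186a7 0x4 0x4 ukpp
0x1e8483 0x3c 0x0 uk
0x1e8483 0xc2 0x10 uk
0x30d43 0xca 0x4 uk
0x30d43 0xca 0x90 uk
0x1e8483 0xc2 0x1 uk
0x30d43 0xc3 0x4 uk
0x30d43 0x4 0x20 uk
0x30d43 0x5 0x3 uk
0x1e8483 0xcd 0x1 uk
0x30d43 0x3 0x4 uk
0x30d43 0x86 0x4 uk
0x30d43 0x4 0x10 uk
0x30d43 0x4 0x40 uk
0x30d43 0x4 0x80 uk
0x30d43 0xc2 0x40 uk
0x30d43 0xc2 0x20 uk
0x30d43 0x5 0x2 uk
0x30d43 0xe6 0x1 uk
0x30d43 0xe7 0x1 uk
0x30d43 0xc3 0x1 uk
0x30d43 0xc3 0x2 uk
0x30d43 0x4 0x1 uk
EOF

# Print the assembled list so it can be compared with the original command.
echo "$events"

# The profiling run itself would then be (same options as above):
# /usr/bin/time /usr/bin/perf record -N -B -T -R -d -e "$events" -- ./stream
```

Keeping the fixed fields in one place means a typo in a single event is easier to spot, at the cost of one more level of indirection.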

-Alexey