Re: [PATCH v2 05/19] perf, tools: Support weak groups

From: Jiri Olsa
Date: Tue Aug 22 2017 - 04:36:18 EST


On Fri, Aug 11, 2017 at 04:26:20PM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
>
> Setting up groups can be complicated due to the
> complicated scheduling restrictions of different PMUs.
> User tools usually don't understand all these restrictions.
> Still in many cases it is useful to set up groups and
> they work most of the time. However, if the group
> is set up wrong, some members will not report any values
> because they never get scheduled.
>
> Add a concept of a 'weak group': try to set up a group,
> but if it's not schedulable, fall back to not using
> a group. That gives us the best of both worlds:
> groups if they work, but still a usable fallback if they don't.
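
To make sure I read the fallback right, here is a minimal sketch of the
idea in Python. It is only a toy model, not the code from the patch:
open_weak_group(), toy_group_opens() and HYPOTHETICAL_COUNTERS are names
made up for illustration, and the "fits if there are enough counters"
check is an assumption standing in for whatever the kernel really decides
when the group is opened.

```python
from typing import Callable, List, Sequence


def open_weak_group(events: Sequence[str],
                    group_opens: Callable[[Sequence[str]], bool]) -> List[List[str]]:
    """Return the grouping actually used.

    One group holding all events if the group can be opened,
    otherwise one single-event group per event.
    """
    if group_opens(events):
        return [list(events)]
    # Fallback: drop the grouping, every event becomes its own leader.
    return [[ev] for ev in events]


# Toy stand-in for the real schedulability check (assumption):
# pretend the PMU has this many generic counters.
HYPOTHETICAL_COUNTERS = 4


def toy_group_opens(events: Sequence[str]) -> bool:
    return len(events) <= HYPOTHETICAL_COUNTERS


events = ["branches", "branch-misses", "l1d.replacement",
          "l2_lines_in.all", "l2_rqsts.all_code_rd"]

print(open_weak_group(events, toy_group_opens))      # five events: split up
print(open_weak_group(events[:4], toy_group_opens))  # four events: one group
```

The point of passing the check in as a callback is that the fallback
policy stays the same no matter how the "does this group open" decision
is actually made.
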
>
> In theory it would be possible to have more complex fallback
> strategies (e.g. try to split the group in half), but
> the simple fallback of not using a group seems to work for now.
>
> So far the weak group is only implemented for perf stat,
> not for record.
>
> Here's an unschedulable group (on IvyBridge with SMT on)
>
> % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
>
> 73,806,067 branches
> 4,848,144 branch-misses # 6.57% of all branches
> 14,754,458 l1d.replacement
> 24,905,558 l2_lines_in.all
> <not supported> l2_rqsts.all_code_rd <------- will never report anything

Also, if I put 'cycles' in place of l2_rqsts.all_code_rd, the open
succeeds cleanly, but every event in the group comes back as
'<not counted>'. I wonder if there's some counter scheduling issue:

[root@krava perf]# ./perf stat -v -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,cycles}:W' -a sleep 1
Using CPUID GenuineIntel-6-3D
l1d.replacement -> cpu/umask=0x1,period=2000003,event=0x51/
l2_lines_in.all -> cpu/umask=0x7,period=100003,event=0xf1/
branches: 0 4004293853 0
branch-misses: 0 4004293853 0
l1d.replacement: 0 4004293853 0
l2_lines_in.all: 0 4004293853 0
cycles: 0 4004293853 0

Performance counter stats for 'system wide':

<not counted> branches (0.00%)
<not counted> branch-misses (0.00%)
<not counted> l1d.replacement (0.00%)
<not counted> l2_lines_in.all (0.00%)
<not counted> cycles (0.00%)

1.001088589 seconds time elapsed
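
For reading the -v lines above: as far as I understand builtin-stat.c,
the three numbers after each event name are the raw value, the time the
event was enabled and the time it was actually running on a counter.
A running time of 0 means the event was never scheduled, which is what
ends up printed as '<not counted>', and the percentage is running over
enabled. A small Python sketch of that reading follows; the helper names
are made up, and the field interpretation is my assumption rather than
something the output itself states.

```python
def parse_counter_line(line):
    """Split a 'name: val ena run' line from perf stat -v into its parts."""
    name, rest = line.rsplit(":", 1)
    val, ena, run = (int(field) for field in rest.split())
    return name.strip(), val, ena, run


def describe(val, ena, run):
    """Return (text shown for the count, percentage of time running)."""
    if ena == 0 or run == 0:
        # Never got onto a hardware counter while enabled.
        return "<not counted>", 0.0
    return str(val), 100.0 * run / ena


verbose_lines = [
    "branches: 0 4004293853 0",
    "cycles: 0 4004293853 0",
]

for line in verbose_lines:
    name, val, ena, run = parse_counter_line(line)
    shown, pct = describe(val, ena, run)
    print(f"{shown:>15} {name} ({pct:.2f}%)")
```

With the numbers from the run above, every event has enabled time but
zero running time, so the whole group was never put on the PMU.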

jirka