Re: [LKP] Re: [perf vendor events] 3f5f0df7bf: perf-sanity-tests.perf_all_metrics_test.fail

From: Carel Si
Date: Wed Apr 13 2022 - 03:06:14 EST


Hi,

On Fri, Mar 04, 2022 at 10:10:53AM -0800, Ian Rogers wrote:
> On Fri, Mar 4, 2022 at 12:33 AM kernel test robot <oliver.sang@xxxxxxxxx> wrote:
> >
> >
> >
> > Greeting,
> >
> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: 3f5f0df7bf0f8c48d33d43454fc0b7d0f3ab9537 ("perf vendor events: Update metrics for Skylake")
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >
> > in testcase: perf-sanity-tests
> > version: perf-x86_64-fb184c4af9b9-1_20220302
> > with following parameters:
> >
> > perf_compiler: clang
> > ucode: 0xec
> >
> >
> >
> > on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz with 32G memory
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
> Hi,
>
> Thanks for the report! There is no information in the test output that
> I can diagnose the issue with, could you add the -v option to perf
> test so that I can see what the cause is, rather than just pass/fail.

We added the '-v' option and found that 3f5f0df7bf fails when testing
'Branching_Overhead' [1] and 'IpArith_Scalar_SP' [2]. Full details are
attached in perf-sanity-tests.xz.

[1]

Testing Branching_Overhead
Metric 'Branching_Overhead' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
Average synthesis took: 459.468 usec (+- 0.265 usec)
Average num. events: 44.000 (+- 0.000)
Average time per event 10.442 usec
Average data synthesis took: 486.181 usec (+- 0.272 usec)
Average num. events: 296.000 (+- 0.000)
Average time per event 1.643 usec

Performance counter stats for 'perf bench internals synthesize':

<not counted> BR_INST_RETIRED.NEAR_CALL (0.00%)
<not counted> BR_INST_RETIRED.NEAR_TAKEN (0.00%)
<not counted> BR_INST_RETIRED.NOT_TAKEN (0.00%)
<not counted> BR_INST_RETIRED.CONDITIONAL (0.00%)
<not counted> CPU_CLK_UNHALTED.THREAD (0.00%)
9772951660 ns duration_time

9.772951660 seconds time elapsed

4.343887000 seconds user
5.248839000 seconds sys


Some events weren't counted. Try disabling the NMI watchdog:
echo 0 > /proc/sys/kernel/nmi_watchdog
perf stat ...
echo 1 > /proc/sys/kernel/nmi_watchdog
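
The hint at the end of the [1] output suggests the NMI watchdog may be
holding a counter. As a rough sketch only (nmi_watchdog_enabled is a
hypothetical helper, not part of perf or lkp-tests), the watchdog state
could be checked before rerunning the test. The default path is the procfs
file named in the hint above:

```python
from pathlib import Path


# Hypothetical helper, not part of perf or lkp-tests.
def nmi_watchdog_enabled(path="/proc/sys/kernel/nmi_watchdog"):
    """Return True if the file at `path` holds a non-zero integer.

    Returns None when the file is missing, unreadable, or does not
    contain an integer (e.g. a non-Linux host or a restricted container).
    """
    try:
        text = Path(path).read_text().strip()
    except OSError:
        return None
    try:
        return int(text) != 0
    except ValueError:
        return None


if __name__ == "__main__":
    print("NMI watchdog enabled:", nmi_watchdog_enabled())
```

If this reports True on the test machine, rerunning with the watchdog
disabled (as the hint describes) would show whether the five
'<not counted>' events are caused by it.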

[2]

Testing IpArith_Scalar_SP
Metric 'IpArith_Scalar_SP' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
Average synthesis took: 458.601 usec (+- 0.257 usec)
Average num. events: 44.000 (+- 0.000)
Average time per event 10.423 usec
Average data synthesis took: 486.297 usec (+- 0.306 usec)
Average num. events: 296.000 (+- 0.000)
Average time per event 1.643 usec

Performance counter stats for 'perf bench internals synthesize':

108854260048 INST_RETIRED.ANY
0 FP_ARITH_INST_RETIRED.SCALAR_SINGLE
9750270760 ns duration_time

9.750270760 seconds time elapsed

4.288438000 seconds user
5.323337000 seconds sys
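
The two failures look different: in [1] every hardware event is
'<not counted>', while in [2] the events are counted but
FP_ARITH_INST_RETIRED.SCALAR_SINGLE is 0. For triaging further output of
this kind, here is a minimal sketch that sorts events into those cases.
classify_events is a hypothetical helper, and the line format it parses is
assumed from the output pasted in this mail, not from any stable perf
interface:

```python
import re

# Assumed line shape: "<value> [ns] <event> ...", where <value> is an
# integer count or a "<not counted>" / "<not supported>" marker.
_LINE = re.compile(
    r"^\s*(<not counted>|<not supported>|\d[\d,]*)\s+(?:ns\s+)?(\S+)"
)


# Hypothetical helper, not part of perf or lkp-tests.
def classify_events(perf_stat_output):
    """Map each event name to 'not counted', 'not supported', 'zero' or 'ok'."""
    result = {}
    for line in perf_stat_output.splitlines():
        m = _LINE.match(line)
        if not m:
            continue  # benchmark chatter, timing summary, blank lines
        value, event = m.groups()
        if value.startswith("<"):
            result[event] = value.strip("<>")
        elif int(value.replace(",", "")) == 0:
            result[event] = "zero"
        else:
            result[event] = "ok"
    return result
```

On the output in [2] this would mark FP_ARITH_INST_RETIRED.SCALAR_SINGLE
as 'zero', which fits the case Ian describes below where the benchmark
does not generate counts for the metric.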

Thanks

> At the time of filing the update I didn't have access to a Skylake
> machine (just SkylakeX) but this test was run as detailed in the
> commit message:
> https://lore.kernel.org/lkml/20220201015858.1226914-21-irogers@xxxxxxxxxx/
> Knowing the test, I suspect there may be a bad event on Skylake, but
> can't confirm this because I lack the hardware and/or the test output.
> The issue may also be how the test was run, such as not as root, not
> in a container. There is a further issue with this test that metrics
> (e.g. number of vector ops) that measure things that a simple
> benchmark doesn't cause counts for can fail the test, as the test is
> checking if the metric is reported - for example, there may be no
> vector ops within the simple benchmark.
>
> Thanks,
> Ian
>
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> >
> >
> >
> > 2022-03-02 19:01:56 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-func-3f5f0df7bf0f8c48d33d43454fc0b7d0f3ab9537/tools/perf/perf test 89
> > 89: perf all metricgroups test : Ok
> > 2022-03-02 19:02:05 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-func-3f5f0df7bf0f8c48d33d43454fc0b7d0f3ab9537/tools/perf/perf test 90
> > 90: perf all metrics test : FAILED!
> > 2022-03-02 19:07:00 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-func-3f5f0df7bf0f8c48d33d43454fc0b7d0f3ab9537/tools/perf/perf test 91
> > 91: perf all PMU test : Ok
> >
> >
> >
> > To reproduce:
> >
> > git clone https://github.com/intel/lkp-tests.git
> > cd lkp-tests
> > sudo bin/lkp install job.yaml # job file is attached in this email
> > bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
> > sudo bin/lkp run generated-yaml-file
> >
> > # if come across any failure that blocks the test,
> > # please remove ~/.lkp and /lkp dir to run from a clean state.
> >
> >
> >
> > ---
> > 0DAY/LKP+ Test Infrastructure Open Source Technology Center
> > https://lists.01.org/hyperkitty/list/lkp@xxxxxxxxxxxx Intel Corporation
> >
> > Thanks,
> > Oliver Sang
> >

Attachment: perf-sanity-tests.xz
Description: application/xz