Re: [linus:master] [apparmor] 1ad22fcc4d: stress-ng.kill.ops_per_sec -42.5% regression

From: Yin, Fengwei
Date: Fri Jan 27 2023 - 20:37:37 EST


Hi John,

On 12/31/2022 3:18 PM, kernel test robot wrote:
>
> Greetings,
>
> FYI, we noticed a -42.5% regression of stress-ng.kill.ops_per_sec due to commit:
>
>
> commit: 1ad22fcc4d0d2fb2e0f35aed555a86d016d5e590 ("apparmor: rework profile->rules to be a list")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> in testcase: stress-ng
> on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory
> with the following parameters:
>
> nr_threads: 10%
> disk: 1HDD
> testtime: 60s
> fs: ext4
> class: os
> test: kill
> cpufreq_governor: performance

Do you think any other information needs to be collected for this
regression report? Thanks.


Regards
Yin, Fengwei


>
>
>
>
> If you fix the issue, kindly add the following tags
> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> | Link: https://lore.kernel.org/oe-lkp/202212311546.755a3ed7-oliver.sang@xxxxxxxxx
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> sudo bin/lkp install job.yaml # job file is attached in this email
> bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
> sudo bin/lkp run generated-yaml-file
>
> # if you come across any failure that blocks the test,
> # please remove ~/.lkp and the /lkp dir to run from a clean state.
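
For a quicker look than the full lkp job, a rough sketch like the one below can hammer the same kill(2) entry point that shows up in the profile (__x64_sys_kill -> kill_something_info -> security_task_kill -> apparmor_task_kill -> aa_may_signal). It assumes Linux and Python 3, runs in a single process, and kill_rate is a made-up helper name. It is not what the stress-ng kill stressor does, and it has not been checked to show the same slowdown, so it is only meant as a way to compare the two kernels on the same box.

```python
# Rough single-process probe of the kill(2) path.
# NOT the stress-ng kill stressor; kill_rate is a hypothetical helper.
import os
import time


def kill_rate(duration_s=1.0):
    """Call kill(own_pid, 0) for about duration_s seconds; return calls/sec."""
    pid = os.getpid()
    calls = 0
    start = time.perf_counter()
    deadline = start + duration_s
    while time.perf_counter() < deadline:
        # Signal 0 delivers nothing, but the kernel still runs the
        # existence and permission checks for the target process.
        os.kill(pid, 0)
        calls += 1
    elapsed = time.perf_counter() - start
    return calls / elapsed


if __name__ == "__main__":
    print(f"{kill_rate(5.0):.0f} kill() calls/sec")
```

Running it on both 217af7e2f4 and 1ad22fcc4d and comparing the printed rates gives a first hint. The report used nr_threads=10% on a 2-socket machine, so a single process may well understate any cross-node effect.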
>
> =========================================================================================
> class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> os/gcc-11/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp7/kill/stress-ng/60s
>
> commit:
> 217af7e2f4 ("apparmor: refactor profile rules and attachments")
> 1ad22fcc4d ("apparmor: rework profile->rules to be a list")
>
> 217af7e2f4deb629 1ad22fcc4d0d2fb2e0f35aed555
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 2820 ± 2% -42.7% 1616 stress-ng.kill.kill_calls_per_sec
> 511090 -42.5% 293660 stress-ng.kill.ops
> 8518 -42.5% 4894 stress-ng.kill.ops_per_sec
> 3778 ± 2% -58.4% 1571 ± 3% stress-ng.time.involuntary_context_switches
> 859945 ± 3% -29.7% 604857 ± 2% stress-ng.time.voluntary_context_switches
> 70351 ± 12% +22.0% 85795 ± 8% meminfo.AnonHugePages
> 0.05 ± 7% -22.6% 0.04 turbostat.IPC
> 28627 ± 3% -27.0% 20903 ± 2% vmstat.system.cs
> 128187 ± 92% +78.6% 228921 ± 49% sched_debug.cfs_rq:/.min_vruntime.max
> 20901 ±107% +124.7% 46965 ± 59% sched_debug.cfs_rq:/.min_vruntime.stddev
> -24201 -149.5% 11969 ± 74% sched_debug.cfs_rq:/.spread0.avg
> 69818 ±123% +170.9% 189126 ± 59% sched_debug.cfs_rq:/.spread0.max
> 20904 ±107% +124.7% 46965 ± 59% sched_debug.cfs_rq:/.spread0.stddev
> 1.19 ± 8% +6.3 7.48 ± 10% perf-profile.calltrace.cycles-pp.aa_may_signal.apparmor_task_kill.security_task_kill.kill_something_info.__x64_sys_kill
> 0.58 ± 8% -0.2 0.40 ± 13% perf-profile.children.cycles-pp.check_kill_permission
> 0.30 ± 12% -0.1 0.16 ± 8% perf-profile.children.cycles-pp.profile_signal_perm
> 0.20 ± 10% -0.1 0.12 ± 13% perf-profile.children.cycles-pp.audit_signal_info_syscall
> 0.21 ± 23% -0.1 0.14 ± 12% perf-profile.children.cycles-pp.pause
> 0.18 ± 16% -0.1 0.11 ± 20% perf-profile.children.cycles-pp.__task_pid_nr_ns
> 0.15 ± 10% -0.1 0.09 ± 14% perf-profile.children.cycles-pp.__kill_pgrp_info
> 0.16 ± 13% -0.1 0.10 ± 14% perf-profile.children.cycles-pp.exit_to_user_mode_loop
> 0.16 ± 15% -0.1 0.11 ± 12% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
> 0.10 ± 9% -0.0 0.06 ± 11% perf-profile.children.cycles-pp.audit_signal_info
> 0.07 ± 18% -0.0 0.04 ± 71% perf-profile.children.cycles-pp.alloc_empty_file
> 0.07 ± 18% -0.0 0.04 ± 71% perf-profile.children.cycles-pp.__alloc_file
> 0.11 ± 7% -0.0 0.08 ± 16% perf-profile.children.cycles-pp.do_send_sig_info
> 0.10 ± 12% -0.0 0.06 ± 14% perf-profile.children.cycles-pp.__send_signal_locked
> 0.10 ± 17% -0.0 0.06 ± 19% perf-profile.children.cycles-pp.arch_do_signal_or_restart
> 0.06 ± 14% +0.0 0.11 ± 18% perf-profile.children.cycles-pp.ns_capable
> 0.06 ± 14% +0.0 0.11 ± 18% perf-profile.children.cycles-pp.security_capable
> 0.06 ± 14% +0.0 0.11 ± 18% perf-profile.children.cycles-pp.apparmor_capable
> 0.09 ± 11% +0.0 0.14 ± 23% perf-profile.children.cycles-pp.kernel_clone
> 0.09 ± 8% +0.0 0.14 ± 24% perf-profile.children.cycles-pp.copy_process
> 0.04 ± 71% +0.0 0.08 ± 18% perf-profile.children.cycles-pp.__do_sys_clone
> 0.07 ± 18% +0.1 0.16 ± 9% perf-profile.children.cycles-pp.__x64_sys_exit_group
> 0.07 ± 18% +0.1 0.16 ± 9% perf-profile.children.cycles-pp.do_group_exit
> 0.07 ± 18% +0.1 0.16 ± 9% perf-profile.children.cycles-pp.do_exit
> 0.06 ± 47% +0.1 0.15 ± 8% perf-profile.children.cycles-pp.exit_notify
> 0.29 ± 12% +0.1 0.38 ± 11% perf-profile.children.cycles-pp.queued_read_lock_slowpath
> 0.15 ± 10% +0.1 0.28 ± 13% perf-profile.children.cycles-pp.queued_write_lock_slowpath
> 0.36 ± 10% +0.1 0.50 ± 10% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> 1.24 ± 8% +6.3 7.56 ± 10% perf-profile.children.cycles-pp.aa_may_signal
> 0.28 ± 12% -0.1 0.15 ± 10% perf-profile.self.cycles-pp.profile_signal_perm
> 0.23 ± 16% -0.1 0.12 ± 17% perf-profile.self.cycles-pp.check_kill_permission
> 0.20 ± 10% -0.1 0.12 ± 13% perf-profile.self.cycles-pp.audit_signal_info_syscall
> 0.15 ± 11% -0.1 0.08 ± 13% perf-profile.self.cycles-pp.kill_something_info
> 0.16 ± 6% -0.1 0.10 ± 20% perf-profile.self.cycles-pp.security_task_kill
> 0.14 ± 11% -0.1 0.09 ± 14% perf-profile.self.cycles-pp.__kill_pgrp_info
> 0.16 ± 15% -0.1 0.10 ± 22% perf-profile.self.cycles-pp.__task_pid_nr_ns
> 0.08 ± 13% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.audit_signal_info
> 0.06 ± 14% +0.0 0.11 ± 16% perf-profile.self.cycles-pp.apparmor_capable
> 0.36 ± 10% +0.1 0.50 ± 10% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> 0.96 ± 8% +6.4 7.38 ± 10% perf-profile.self.cycles-pp.aa_may_signal
> 16.17 +18.8% 19.21 perf-stat.i.MPKI
> 8.529e+08 -30.2% 5.952e+08 perf-stat.i.branch-instructions
> 0.68 ± 3% +0.1 0.79 ± 2% perf-stat.i.branch-miss-rate%
> 7256065 ± 3% -7.7% 6698864 ± 2% perf-stat.i.branch-misses
> 13.08 ± 3% +6.0 19.07 perf-stat.i.cache-miss-rate%
> 9361692 ± 2% +16.3% 10884924 ± 2% perf-stat.i.cache-misses
> 71450858 -20.7% 56687571 ± 2% perf-stat.i.cache-references
> 29596 ± 3% -27.5% 21452 ± 2% perf-stat.i.context-switches
> 6.21 +50.4% 9.34 perf-stat.i.cpi
> 2974 ± 2% -14.4% 2547 ± 2% perf-stat.i.cycles-between-cache-misses
> 0.02 ± 13% +0.0 0.03 ± 6% perf-stat.i.dTLB-load-miss-rate%
> 1.408e+09 -32.2% 9.544e+08 perf-stat.i.dTLB-loads
> 7.763e+08 -33.2% 5.183e+08 perf-stat.i.dTLB-stores
> 34.02 -3.0 31.06 perf-stat.i.iTLB-load-miss-rate%
> 1109352 ± 2% -18.6% 903511 perf-stat.i.iTLB-load-misses
> 2150307 -6.8% 2003881 perf-stat.i.iTLB-loads
> 4.552e+09 -30.7% 3.153e+09 perf-stat.i.instructions
> 4134 ± 2% -15.1% 3508 perf-stat.i.instructions-per-iTLB-miss
> 0.19 -26.4% 0.14 perf-stat.i.ipc
> 814.65 -16.5% 680.21 perf-stat.i.metric.K/sec
> 31.63 -31.9% 21.53 perf-stat.i.metric.M/sec
> 80.72 ± 3% +9.8 90.56 perf-stat.i.node-load-miss-rate%
> 325431 ± 5% +970.6% 3483935 ± 2% perf-stat.i.node-load-misses
> 79936 ± 19% +323.7% 338676 ± 6% perf-stat.i.node-loads
> 4260901 ± 2% -34.5% 2792405 perf-stat.i.node-store-misses
> 15.70 +14.6% 18.00 perf-stat.overall.MPKI
> 0.85 ± 2% +0.3 1.12 ± 2% perf-stat.overall.branch-miss-rate%
> 13.11 ± 2% +6.1 19.20 perf-stat.overall.cache-miss-rate%
> 6.03 +44.8% 8.73 perf-stat.overall.cpi
> 2931 -13.8% 2527 ± 2% perf-stat.overall.cycles-between-cache-misses
> 0.02 ± 10% +0.0 0.03 ± 7% perf-stat.overall.dTLB-load-miss-rate%
> 34.03 -3.0 31.07 perf-stat.overall.iTLB-load-miss-rate%
> 4104 ± 2% -15.0% 3487 perf-stat.overall.instructions-per-iTLB-miss
> 0.17 -30.9% 0.11 perf-stat.overall.ipc
> 80.44 ± 3% +10.7 91.14 perf-stat.overall.node-load-miss-rate%
> 8.39e+08 -30.3% 5.852e+08 perf-stat.ps.branch-instructions
> 7111005 ± 2% -7.8% 6555177 ± 2% perf-stat.ps.branch-misses
> 9215998 ± 2% +16.3% 10714798 ± 2% perf-stat.ps.cache-misses
> 70327211 -20.7% 55796363 ± 2% perf-stat.ps.cache-references
> 29136 ± 3% -27.5% 21118 ± 2% perf-stat.ps.context-switches
> 1.385e+09 -32.2% 9.386e+08 perf-stat.ps.dTLB-loads
> 7.64e+08 -33.3% 5.099e+08 perf-stat.ps.dTLB-stores
> 1091797 ± 2% -18.6% 888828 perf-stat.ps.iTLB-load-misses
> 2116297 -6.8% 1971620 perf-stat.ps.iTLB-loads
> 4.478e+09 -30.8% 3.1e+09 perf-stat.ps.instructions
> 320258 ± 5% +970.9% 3429628 ± 2% perf-stat.ps.node-load-misses
> 78661 ± 19% +323.8% 333386 ± 6% perf-stat.ps.node-loads
> 4195116 ± 2% -34.5% 2749032 perf-stat.ps.node-store-misses
> 2.847e+11 -31.2% 1.959e+11 perf-stat.total.instructions
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
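
As a sanity check on the headline figures quoted above, the %change column can be recomputed from the two per-commit means as (new - old) / old. The sketch below does that for a few rows copied from the table. The helper name pct_change is made up for illustration and is not part of lkp-tests. Because the report prints rounded means, other rows may differ in the last digit.

```python
# Recompute the %change column from the per-commit means in the report.
# pct_change is a hypothetical helper, not part of lkp-tests.


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# (metric, mean on 217af7e2f4, mean on 1ad22fcc4d), copied from the table
rows = [
    ("stress-ng.kill.kill_calls_per_sec", 2820, 1616),
    ("stress-ng.kill.ops", 511090, 293660),
    ("stress-ng.kill.ops_per_sec", 8518, 4894),
    ("perf-stat.i.node-load-misses", 325431, 3483935),
]

for name, old, new in rows:
    print(f"{pct_change(old, new):+8.1f}%  {name}")
```

The three stress-ng rows come out at about -42.7%, -42.5% and -42.5%, and node-load-misses at about +970.6%, matching the table.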