Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

From: Fengguang Wu
Date: Mon Aug 15 2016 - 10:16:00 EST


Hi Christoph,

On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
> Snipping the long context:
>
> I think there are three observations here:
>
>  (1) removing the mark_page_accessed call (which is the only significant
>      change in the parent commit) hurts the
>      aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
>      I'd still rather stick to the filemap version and let the
>      VM people sort it out. How do the numbers for this test
>      look for XFS vs say ext4 and btrfs?
>  (2) lots of additional spinlock contention in the new case. A quick
>      check shows that I fat-fingered my rewrite so that we do
>      the xfs_inode_set_eofblocks_tag call now for the pure lookup
>      case, and pretty much all new cycles come from that.
>  (3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
>      we're already doing way too many even without my little bug above.
>
> So I've force pushed a new version of the iomap-fixes branch with
> (2) fixed, and also a little patch to make xfs_inode_set_eofblocks_tag
> a lot less expensive slotted in before that. Would be good to see
> the numbers with that.
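
For reference on (2) and (3): below is a minimal userspace sketch, not the
actual kernel patch, of the kind of early-out that makes repeated
xfs_inode_set_eofblocks_tag calls cheap, i.e. caching the "already tagged"
state in the inode so the pure-lookup case skips the locking and radix-tree
work. All names here (sketch_inode, SKETCH_IEOFBLOCKS, sketch_tag_in_perag)
are purely illustrative.

/*
 * Illustrative sketch only -- not XFS code.  Models "check a cached
 * per-inode flag before doing the expensive tagging work".
 */
#include <pthread.h>
#include <stdio.h>

#define SKETCH_IEOFBLOCKS	0x1	/* hypothetical "already tagged" bit */

struct sketch_inode {
	pthread_mutex_t	flags_lock;	/* stands in for the inode flags lock */
	unsigned int	flags;
};

/* Stand-in for the expensive part: per-AG locking plus radix-tree tagging. */
static void sketch_tag_in_perag(struct sketch_inode *ip)
{
	printf("tagging inode %p for EOF-blocks scanning\n", (void *)ip);
}

static void sketch_set_eofblocks_tag(struct sketch_inode *ip)
{
	/*
	 * Fast path: if the inode already carries the tag, return without
	 * taking any lock.  The unlocked check is racy, but the worst case
	 * is one redundant (idempotent) tagging pass, which is harmless.
	 */
	if (ip->flags & SKETCH_IEOFBLOCKS)
		return;

	pthread_mutex_lock(&ip->flags_lock);
	ip->flags |= SKETCH_IEOFBLOCKS;
	pthread_mutex_unlock(&ip->flags_lock);

	sketch_tag_in_perag(ip);
}

int main(void)
{
	static struct sketch_inode ip = {
		.flags_lock = PTHREAD_MUTEX_INITIALIZER,
	};

	sketch_set_eofblocks_tag(&ip);	/* slow path: does the tagging once */
	sketch_set_eofblocks_tag(&ip);	/* "pure lookup" case: early return */
	return 0;
}

Build with: cc sketch.c -lpthread. The second call returns immediately via
the cached flag; the real optimization is of course in kernel code, this is
just the shape of it.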

The aim7 1BRD tests have finished. There are ups and downs, with overall
performance remaining flat.

99091700659f4df9 74a242ad94d13436a1644c0b45 bf4dc6e4ecc2a3d042029319bc testcase/testparams/testbox
---------------- -------------------------- -------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
159926 157324 158574 GEO-MEAN aim7.jobs-per-min
70897 5% 74137 4% 73775 aim7/1BRD_48G-xfs-creat-clo-1500-performance/ivb44
485217 ± 3% 492431 477533 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
360451 -19% 292980 -17% 299377 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
338114 338410 5% 354078 aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
60130 ± 5% 4% 62438 5% 62923 aim7/1BRD_48G-xfs-disk_src-3000-performance/ivb44
403144 397790 410648 aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44
26327 26534 26128 aim7/1BRD_48G-xfs-sync_disk_rw-600-performance/ivb44
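
As an aside, the GEO-MEAN rows appear to be the plain geometric mean of the
per-testcase aim7.jobs-per-min values. A quick illustrative check in C
against the first (99091700659f4df9) column above:

#include <math.h>
#include <stdio.h>

int main(void)
{
	/* aim7.jobs-per-min values from the 99091700659f4df9 column above */
	const double jpm[] = {
		70897, 485217, 360451, 338114, 60130, 403144, 26327,
	};
	const int n = sizeof(jpm) / sizeof(jpm[0]);
	double log_sum = 0.0;

	for (int i = 0; i < n; i++)
		log_sum += log(jpm[i]);

	/* exp(mean(log(x))) is the n-th root of the product */
	printf("GEO-MEAN = %.0f\n", exp(log_sum / n));
	return 0;
}

Built with cc geomean.c -lm, this prints roughly 159926, matching the first
GEO-MEAN entry.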

The new commit bf4dc6e ("xfs: rewrite and optimize the delalloc write
path") improves the aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
case by 5%. Here are the detailed numbers:

aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44

74a242ad94d13436 bf4dc6e4ecc2a3d042029319bc
---------------- --------------------------
%stddev %change %stddev
\ | \
338410 5% 354078 aim7.jobs-per-min
404390 8% 435117 aim7.time.voluntary_context_switches
2502 -4% 2396 aim7.time.maximum_resident_set_size
15018 -9% 13701 aim7.time.involuntary_context_switches
900 -11% 801 aim7.time.system_time
17432 11% 19365 vmstat.system.cs
47736 ± 19% -24% 36087 interrupts.CAL:Function_call_interrupts
2129646 31% 2790638 proc-vmstat.pgalloc_dma32
379503 13% 429384 numa-meminfo.node0.Dirty
15018 -9% 13701 time.involuntary_context_switches
900 -11% 801 time.system_time
1560 10% 1716 slabinfo.mnt_cache.active_objs
1560 10% 1716 slabinfo.mnt_cache.num_objs
61.53 -4 57.45 ± 4% perf-profile.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
61.63 -4 57.55 ± 4% perf-profile.func.cycles-pp.intel_idle
1007188 ± 16% 156% 2577911 ± 6% numa-numastat.node0.numa_miss
9662857 ± 4% -13% 8420159 ± 3% numa-numastat.node0.numa_foreign
1008220 ± 16% 155% 2570630 ± 6% numa-numastat.node1.numa_foreign
9664033 ± 4% -13% 8413184 ± 3% numa-numastat.node1.numa_miss
26519887 ± 3% 18% 31322674 cpuidle.C1-IVT.time
122238 16% 142383 cpuidle.C1-IVT.usage
46548 11% 51645 cpuidle.C1E-IVT.usage
17253419 13% 19567582 cpuidle.C3-IVT.time
86847 13% 98333 cpuidle.C3-IVT.usage
482033 ± 12% 108% 1000665 ± 8% numa-vmstat.node0.numa_miss
94689 14% 107744 numa-vmstat.node0.nr_zone_write_pending
94677 14% 107718 numa-vmstat.node0.nr_dirty
3156643 ± 3% -20% 2527460 ± 3% numa-vmstat.node0.numa_foreign
429288 ± 12% 129% 983053 ± 8% numa-vmstat.node1.numa_foreign
3104193 ± 3% -19% 2510128 numa-vmstat.node1.numa_miss
6.43 ± 5% 51% 9.70 ± 11% turbostat.Pkg%pc2
0.30 28% 0.38 turbostat.CPU%c3
9.71 9.92 turbostat.RAMWatt
158 154 turbostat.PkgWatt
125 -3% 121 turbostat.CorWatt
1141 -6% 1078 turbostat.Avg_MHz
38.70 -6% 36.48 turbostat.%Busy
5.03 ± 11% -51% 2.46 ± 40% turbostat.Pkg%pc6
8.33 ± 48% 88% 15.67 ± 36% sched_debug.cfs_rq:/.runnable_load_avg.max
1947 ± 3% -12% 1710 ± 7% sched_debug.cfs_rq:/.spread0.stddev
1936 ± 3% -12% 1698 ± 8% sched_debug.cfs_rq:/.min_vruntime.stddev
2170 ± 10% -14% 1863 ± 6% sched_debug.cfs_rq:/.load_avg.max
220926 ± 18% 37% 303192 ± 5% sched_debug.cpu.avg_idle.stddev
0.06 ± 13% 357% 0.28 ± 23% sched_debug.rt_rq:/.rt_time.avg
0.37 ± 10% 240% 1.25 ± 15% sched_debug.rt_rq:/.rt_time.stddev
2.54 ± 10% 160% 6.59 ± 10% sched_debug.rt_rq:/.rt_time.max
0.32 ± 19% 29% 0.42 ± 10% perf-stat.dTLB-load-miss-rate
964727 7% 1028830 perf-stat.context-switches
176406 4% 184289 perf-stat.cpu-migrations
0.29 4% 0.30 perf-stat.branch-miss-rate
1.634e+09 1.673e+09 perf-stat.node-store-misses
23.60 23.99 perf-stat.node-store-miss-rate
40.01 40.57 perf-stat.cache-miss-rate
0.95 -8% 0.87 perf-stat.ipc
3.203e+12 -9% 2.928e+12 perf-stat.cpu-cycles
1.506e+09 -11% 1.345e+09 perf-stat.branch-misses
50.64 ± 13% -14% 43.45 ± 4% perf-stat.iTLB-load-miss-rate
5.285e+11 -14% 4.523e+11 perf-stat.branch-instructions
3.042e+12 -16% 2.551e+12 perf-stat.instructions
7.996e+11 -18% 6.584e+11 perf-stat.dTLB-loads
5.569e+11 ± 4% -18% 4.578e+11 perf-stat.dTLB-stores


Here are the detailed numbers for the slowed-down case:

aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44

99091700659f4df9 bf4dc6e4ecc2a3d042029319bc
---------------- --------------------------
%stddev %change %stddev
\ | \
360451 -17% 299377 aim7.jobs-per-min
12806 481% 74447 aim7.time.involuntary_context_switches
755 44% 1086 aim7.time.system_time
50.17 20% 60.36 aim7.time.elapsed_time
50.17 20% 60.36 aim7.time.elapsed_time.max
438148 446012 aim7.time.voluntary_context_switches
37798 ± 16% 780% 332583 ± 8% interrupts.CAL:Function_call_interrupts
78.82 ± 5% 18% 93.35 ± 5% uptime.boot
2847 ± 7% 11% 3160 ± 7% uptime.idle
147490 ± 8% 34% 197261 ± 3% softirqs.RCU
648159 29% 839283 softirqs.TIMER
160830 10% 177144 softirqs.SCHED
3845352 ± 4% 91% 7349133 numa-numastat.node0.numa_miss
4686838 ± 5% 67% 7835640 numa-numastat.node0.numa_foreign
3848455 ± 4% 91% 7352436 numa-numastat.node1.numa_foreign
4689920 ± 5% 67% 7838734 numa-numastat.node1.numa_miss
50.17 20% 60.36 time.elapsed_time.max
12806 481% 74447 time.involuntary_context_switches
755 44% 1086 time.system_time
50.17 20% 60.36 time.elapsed_time
1563 18% 1846 time.percent_of_cpu_this_job_got
11699 ± 19% 3738% 449048 vmstat.io.bo
18836969 -16% 15789996 vmstat.memory.free
16 19% 19 vmstat.procs.r
19377 459% 108364 vmstat.system.cs
48255 11% 53537 vmstat.system.in
2357299 25% 2951384 meminfo.Inactive(file)
2366381 25% 2960468 meminfo.Inactive
1575292 -9% 1429971 meminfo.Cached
19342499 -17% 16100340 meminfo.MemFree
1057904 -20% 842987 meminfo.Dirty
1057 21% 1284 turbostat.Avg_MHz
35.78 21% 43.24 turbostat.%Busy
9.95 15% 11.47 turbostat.RAMWatt
74 ± 5% 10% 81 turbostat.CoreTmp
74 ± 4% 10% 81 turbostat.PkgTmp
118 8% 128 turbostat.CorWatt
151 7% 162 turbostat.PkgWatt
29.06 -23% 22.39 turbostat.CPU%c6
487 ± 89% 3e+04 26448 ± 57% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
1823 ± 82% 2e+06 1913796 ± 38% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
208475 ± 43% 1e+06 1409494 ± 5% latency_stats.sum.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
6884 ± 73% 8e+04 90790 ± 9% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.file_update_time.xfs_file_aio_write_checks.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.SyS_write
1598 ± 20% 3e+04 35015 ± 27% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_free_eofblocks.xfs_release.xfs_file_release.__fput.____fput.task_work_run
2006 ± 25% 3e+04 31143 ± 35% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
29 ±101% 1e+04 10214 ± 29% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_defer_trans_roll.xfs_defer_finish.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode
1206 ± 51% 9e+03 9919 ± 25% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.touch_atime.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read.SyS_read
29869205 ± 4% -10% 26804569 cpuidle.C1-IVT.time
5737726 39% 7952214 cpuidle.C1E-IVT.time
51141 17% 59958 cpuidle.C1E-IVT.usage
18377551 37% 25176426 cpuidle.C3-IVT.time
96067 17% 112045 cpuidle.C3-IVT.usage
1806811 12% 2024041 cpuidle.C6-IVT.usage
1104420 ± 36% 204% 3361085 ± 27% cpuidle.POLL.time
281 ± 10% 20% 338 cpuidle.POLL.usage
5.61 ± 11% -0.5 5.12 ± 18% perf-profile.cycles-pp.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
5.85 ± 6% -0.8 5.06 ± 15% perf-profile.cycles-pp.hrtimer_interrupt.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter
6.32 ± 6% -0.9 5.42 ± 15% perf-profile.cycles-pp.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
15.77 ± 8% -2 13.83 ± 17% perf-profile.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry
16.04 ± 8% -2 14.01 ± 15% perf-profile.cycles-pp.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
60.25 ± 4% -7 53.03 ± 7% perf-profile.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
60.41 ± 4% -7 53.12 ± 7% perf-profile.func.cycles-pp.intel_idle
1174104 22% 1436859 numa-meminfo.node0.Inactive
1167471 22% 1428271 numa-meminfo.node0.Inactive(file)
770811 -9% 698147 numa-meminfo.node0.FilePages
20707294 -12% 18281509 ± 6% numa-meminfo.node0.Active
20613745 -12% 18180987 ± 6% numa-meminfo.node0.Active(file)
9676639 -17% 8003627 numa-meminfo.node0.MemFree
509906 -22% 396192 numa-meminfo.node0.Dirty
1189539 28% 1524697 numa-meminfo.node1.Inactive(file)
1191989 28% 1525194 numa-meminfo.node1.Inactive
804508 -10% 727067 numa-meminfo.node1.FilePages
9654540 -16% 8077810 numa-meminfo.node1.MemFree
547956 -19% 441933 numa-meminfo.node1.Dirty
396 ± 12% 485% 2320 ± 37% slabinfo.bio-1.num_objs
396 ± 12% 481% 2303 ± 37% slabinfo.bio-1.active_objs
73 140% 176 ± 14% slabinfo.kmalloc-128.active_slabs
73 140% 176 ± 14% slabinfo.kmalloc-128.num_slabs
4734 94% 9171 ± 11% slabinfo.kmalloc-128.num_objs
4734 88% 8917 ± 13% slabinfo.kmalloc-128.active_objs
16238 -10% 14552 ± 3% slabinfo.kmalloc-256.active_objs
17189 -13% 15033 ± 3% slabinfo.kmalloc-256.num_objs
20651 96% 40387 ± 17% slabinfo.radix_tree_node.active_objs
398 91% 761 ± 17% slabinfo.radix_tree_node.active_slabs
398 91% 761 ± 17% slabinfo.radix_tree_node.num_slabs
22313 91% 42650 ± 17% slabinfo.radix_tree_node.num_objs
32 638% 236 ± 28% slabinfo.xfs_efd_item.active_slabs
32 638% 236 ± 28% slabinfo.xfs_efd_item.num_slabs
1295 281% 4934 ± 23% slabinfo.xfs_efd_item.num_objs
1295 280% 4923 ± 23% slabinfo.xfs_efd_item.active_objs
1661 81% 3000 ± 42% slabinfo.xfs_log_ticket.num_objs
1661 78% 2952 ± 42% slabinfo.xfs_log_ticket.active_objs
2617 49% 3905 ± 30% slabinfo.xfs_trans.num_objs
2617 48% 3870 ± 31% slabinfo.xfs_trans.active_objs
1015933 567% 6779099 perf-stat.context-switches
4.864e+08 126% 1.101e+09 perf-stat.node-load-misses
1.179e+09 103% 2.399e+09 perf-stat.node-loads
0.06 ± 34% 92% 0.12 ± 11% perf-stat.dTLB-store-miss-rate
2.985e+08 ± 32% 86% 5.542e+08 ± 11% perf-stat.dTLB-store-misses
2.551e+09 ± 15% 81% 4.625e+09 ± 13% perf-stat.dTLB-load-misses
0.39 ± 14% 66% 0.65 ± 13% perf-stat.dTLB-load-miss-rate
1.26e+09 60% 2.019e+09 perf-stat.node-store-misses
46072661 ± 27% 49% 68472915 perf-stat.iTLB-loads
2.738e+12 ± 4% 43% 3.916e+12 perf-stat.cpu-cycles
21.48 32% 28.35 perf-stat.node-store-miss-rate
1.612e+10 ± 3% 28% 2.066e+10 perf-stat.cache-references
1.669e+09 ± 3% 24% 2.063e+09 perf-stat.branch-misses
6.816e+09 ± 3% 20% 8.179e+09 perf-stat.cache-misses
177699 18% 209145 perf-stat.cpu-migrations
0.39 13% 0.44 perf-stat.branch-miss-rate
4.606e+09 11% 5.102e+09 perf-stat.node-stores
4.329e+11 ± 4% 9% 4.727e+11 perf-stat.branch-instructions
6.458e+11 9% 7.046e+11 perf-stat.dTLB-loads
29.19 8% 31.45 perf-stat.node-load-miss-rate
286173 8% 308115 perf-stat.page-faults
286191 8% 308109 perf-stat.minor-faults
45084934 4% 47073719 perf-stat.iTLB-load-misses
42.28 -6% 39.58 perf-stat.cache-miss-rate
50.62 ± 16% -19% 40.75 perf-stat.iTLB-load-miss-rate
0.89 -28% 0.64 perf-stat.ipc
2 ± 36% 4e+07% 970191 proc-vmstat.pgrotated
150 ± 21% 1e+07% 15356485 ± 3% proc-vmstat.nr_vmscan_immediate_reclaim
76823 ± 35% 56899% 43788651 proc-vmstat.pgscan_direct
153407 ± 19% 4483% 7031431 proc-vmstat.nr_written
619699 ± 19% 4441% 28139689 proc-vmstat.pgpgout
5342421 1061% 62050709 proc-vmstat.pgactivate
47 ± 25% 354% 217 proc-vmstat.nr_pages_scanned
8542963 ± 3% 78% 15182914 proc-vmstat.numa_miss
8542963 ± 3% 78% 15182715 proc-vmstat.numa_foreign
2820568 31% 3699073 proc-vmstat.pgalloc_dma32
589234 25% 738160 proc-vmstat.nr_zone_inactive_file
589240 25% 738155 proc-vmstat.nr_inactive_file
61347830 13% 69522958 proc-vmstat.pgfree
393711 -9% 356981 proc-vmstat.nr_file_pages
4831749 -17% 4020131 proc-vmstat.nr_free_pages
61252784 -18% 50183773 proc-vmstat.pgrefill
61245420 -18% 50176301 proc-vmstat.pgdeactivate
264397 -20% 210222 proc-vmstat.nr_zone_write_pending
264367 -20% 210188 proc-vmstat.nr_dirty
60420248 -39% 36646178 proc-vmstat.pgscan_kswapd
60373976 -44% 33735064 proc-vmstat.pgsteal_kswapd
1753 -98% 43 ± 18% proc-vmstat.pageoutrun
1095 -98% 25 ± 17% proc-vmstat.kswapd_low_wmark_hit_quickly
656 ± 3% -98% 15 ± 24% proc-vmstat.kswapd_high_wmark_hit_quickly
0 1136221 numa-vmstat.node0.workingset_refault
0 1136221 numa-vmstat.node0.workingset_activate
23 ± 45% 1e+07% 2756907 numa-vmstat.node0.nr_vmscan_immediate_reclaim
37618 ± 24% 3234% 1254165 numa-vmstat.node0.nr_written
1346538 ± 4% 104% 2748439 numa-vmstat.node0.numa_miss
1577620 ± 5% 80% 2842882 numa-vmstat.node0.numa_foreign
291242 23% 357407 numa-vmstat.node0.nr_inactive_file
291237 23% 357390 numa-vmstat.node0.nr_zone_inactive_file
13961935 12% 15577331 numa-vmstat.node0.numa_local
13961938 12% 15577332 numa-vmstat.node0.numa_hit
39831 10% 43768 numa-vmstat.node0.nr_unevictable
39831 10% 43768 numa-vmstat.node0.nr_zone_unevictable
193467 -10% 174639 numa-vmstat.node0.nr_file_pages
5147212 -12% 4542321 ± 6% numa-vmstat.node0.nr_active_file
5147237 -12% 4542325 ± 6% numa-vmstat.node0.nr_zone_active_file
2426129 -17% 2008637 numa-vmstat.node0.nr_free_pages
128285 -23% 99206 numa-vmstat.node0.nr_zone_write_pending
128259 -23% 99183 numa-vmstat.node0.nr_dirty
0 1190594 numa-vmstat.node1.workingset_refault
0 1190594 numa-vmstat.node1.workingset_activate
21 ± 36% 1e+07% 3120425 ± 4% numa-vmstat.node1.nr_vmscan_immediate_reclaim
38541 ± 26% 3336% 1324185 numa-vmstat.node1.nr_written
1316819 ± 4% 105% 2699075 numa-vmstat.node1.numa_foreign
1547929 ± 4% 80% 2793491 numa-vmstat.node1.numa_miss
296714 28% 381124 numa-vmstat.node1.nr_zone_inactive_file
296714 28% 381123 numa-vmstat.node1.nr_inactive_file
14311131 10% 15750908 numa-vmstat.node1.numa_hit
14311130 10% 15750905 numa-vmstat.node1.numa_local
201164 -10% 181742 numa-vmstat.node1.nr_file_pages
2422825 -16% 2027750 numa-vmstat.node1.nr_free_pages
137069 -19% 110501 numa-vmstat.node1.nr_zone_write_pending
137069 -19% 110497 numa-vmstat.node1.nr_dirty
737 ± 29% 27349% 202387 sched_debug.cfs_rq:/.min_vruntime.min
3637 ± 20% 7919% 291675 sched_debug.cfs_rq:/.min_vruntime.avg
11.00 ± 44% 4892% 549.17 ± 9% sched_debug.cfs_rq:/.runnable_load_avg.max
2.12 ± 36% 4853% 105.12 ± 5% sched_debug.cfs_rq:/.runnable_load_avg.stddev
1885 ± 6% 4189% 80870 sched_debug.cfs_rq:/.min_vruntime.stddev
1896 ± 6% 4166% 80895 sched_debug.cfs_rq:/.spread0.stddev
10774 ± 13% 4113% 453925 sched_debug.cfs_rq:/.min_vruntime.max
1.02 ± 19% 2630% 27.72 ± 7% sched_debug.cfs_rq:/.runnable_load_avg.avg
63060 ± 45% 776% 552157 sched_debug.cfs_rq:/.load.max
14442 ± 21% 590% 99615 ± 14% sched_debug.cfs_rq:/.load.stddev
8397 ± 9% 309% 34370 ± 12% sched_debug.cfs_rq:/.load.avg
46.02 ± 24% 176% 126.96 ± 6% sched_debug.cfs_rq:/.util_avg.stddev
817 19% 974 ± 3% sched_debug.cfs_rq:/.util_avg.max
721 -17% 600 ± 3% sched_debug.cfs_rq:/.util_avg.avg
595 ± 11% -38% 371 ± 7% sched_debug.cfs_rq:/.util_avg.min
1484 ± 20% -47% 792 ± 5% sched_debug.cfs_rq:/.load_avg.min
1798 ± 4% -50% 903 ± 5% sched_debug.cfs_rq:/.load_avg.avg
322 ± 8% 7726% 25239 ± 8% sched_debug.cpu.nr_switches.min
969 7238% 71158 sched_debug.cpu.nr_switches.avg
2.23 ± 40% 4650% 106.14 ± 4% sched_debug.cpu.cpu_load[0].stddev
943 ± 4% 3475% 33730 ± 3% sched_debug.cpu.nr_switches.stddev
0.87 ± 25% 3057% 27.46 ± 7% sched_debug.cpu.cpu_load[0].avg
5.43 ± 13% 2232% 126.61 sched_debug.cpu.nr_uninterruptible.stddev
6131 ± 3% 2028% 130453 sched_debug.cpu.nr_switches.max
1.58 ± 29% 1852% 30.90 ± 4% sched_debug.cpu.cpu_load[4].avg
2.00 ± 49% 1422% 30.44 ± 5% sched_debug.cpu.cpu_load[3].avg
63060 ± 45% 1053% 726920 ± 32% sched_debug.cpu.load.max
21.25 ± 44% 777% 186.33 ± 7% sched_debug.cpu.nr_uninterruptible.max
14419 ± 21% 731% 119865 ± 31% sched_debug.cpu.load.stddev
3586 381% 17262 sched_debug.cpu.nr_load_updates.min
8286 ± 8% 364% 38414 ± 17% sched_debug.cpu.load.avg
5444 303% 21956 sched_debug.cpu.nr_load_updates.avg
1156 231% 3827 sched_debug.cpu.nr_load_updates.stddev
8603 ± 4% 222% 27662 sched_debug.cpu.nr_load_updates.max
1410 165% 3735 sched_debug.cpu.curr->pid.max
28742 ± 15% 120% 63101 ± 7% sched_debug.cpu.clock.min
28742 ± 15% 120% 63101 ± 7% sched_debug.cpu.clock_task.min
28748 ± 15% 120% 63107 ± 7% sched_debug.cpu.clock.avg
28748 ± 15% 120% 63107 ± 7% sched_debug.cpu.clock_task.avg
28751 ± 15% 120% 63113 ± 7% sched_debug.cpu.clock.max
28751 ± 15% 120% 63113 ± 7% sched_debug.cpu.clock_task.max
442 ± 11% 93% 854 ± 15% sched_debug.cpu.curr->pid.avg
618 ± 3% 72% 1065 ± 4% sched_debug.cpu.curr->pid.stddev
1.88 ± 11% 50% 2.83 ± 8% sched_debug.cpu.clock.stddev
1.88 ± 11% 50% 2.83 ± 8% sched_debug.cpu.clock_task.stddev
5.22 ± 9% -55% 2.34 ± 23% sched_debug.rt_rq:/.rt_time.max
0.85 -55% 0.38 ± 28% sched_debug.rt_rq:/.rt_time.stddev
0.17 -56% 0.07 ± 33% sched_debug.rt_rq:/.rt_time.avg
27633 ± 16% 124% 61980 ± 8% sched_debug.ktime
28745 ± 15% 120% 63102 ± 7% sched_debug.sched_clk
28745 ± 15% 120% 63102 ± 7% sched_debug.cpu_clk

Thanks,
Fengguang