[LKP] [f2fs] 8b26ef98da3: +27.4% iostat.sda.wrqm/s

From: Huang Ying
Date: Sun Dec 14 2014 - 22:27:07 EST


FYI, we noticed the below changes on

commit 8b26ef98da3387eb57a8a5c1747c6e628948ee0c ("f2fs: use rw_semaphore for nat entry lock")

testbox/testcase/testparams: lkp-ne04/fsmark/performance-1x-32t-1HDD-f2fs-8K-400M-fsyncBeforeClose-16d-256fpd
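
For context: the commit under test replaces the lock protecting f2fs NAT (node
address table) entries with an rw_semaphore. The snippet below is only a rough
sketch of that kind of conversion, assuming the lock was previously a spinning
rwlock_t; the struct and function names are illustrative placeholders, not the
literal f2fs diff. Because rw_semaphore waiters sleep instead of spinning, a
change like this is consistent with the jump in time.voluntary_context_switches
shown in the comparison below.

#include <linux/rwsem.h>

/* Placeholder for the f2fs node-manager info structure (illustrative only). */
struct nat_lock_example {
        struct rw_semaphore nat_tree_lock;      /* was: rwlock_t nat_tree_lock; */
};

static void nat_lock_example_init(struct nat_lock_example *nm_i)
{
        init_rwsem(&nm_i->nat_tree_lock);       /* was: rwlock_init(...) */
}

/* Readers (NAT entry lookups) now sleep while waiting for a writer. */
static void nat_lookup_example(struct nat_lock_example *nm_i)
{
        down_read(&nm_i->nat_tree_lock);        /* was: read_lock(...) */
        /* ... look up a NAT entry ... */
        up_read(&nm_i->nat_tree_lock);          /* was: read_unlock(...) */
}

/* Writers (NAT entry updates) sleep instead of spinning on contention. */
static void nat_update_example(struct nat_lock_example *nm_i)
{
        down_write(&nm_i->nat_tree_lock);       /* was: write_lock(...) */
        /* ... insert or modify NAT entries ... */
        up_write(&nm_i->nat_tree_lock);         /* was: write_unlock(...) */
}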

4634d71ed190c99e 8b26ef98da3387eb57a8a5c174
---------------- --------------------------
old value ± %stddev     %change     new value ± %stddev     metric
420 ± 0% +12.0% 470 ± 0% fsmark.files_per_sec
7.37 ± 22% -84.0% 1.18 ± 26% turbostat.%pc6
2122 ± 2% +929.0% 21838 ± 1% proc-vmstat.pgactivate
41341 ± 34% +226.9% 135151 ± 40% sched_debug.cpu#4.sched_count
4093 ± 29% +266.1% 14988 ± 21% sched_debug.cpu#12.ttwu_count
20670219 ± 24% +243.7% 71049994 ± 11% cpuidle.C1-NHM.time
4279 ± 25% +237.2% 14431 ± 19% sched_debug.cpu#14.ttwu_count
3995 ± 19% +237.7% 13492 ± 22% sched_debug.cpu#11.ttwu_count
4092 ± 25% +230.0% 13503 ± 19% sched_debug.cpu#15.ttwu_count
7241 ± 14% +218.7% 23080 ± 18% sched_debug.cpu#3.ttwu_count
4065 ± 28% +251.5% 14291 ± 24% sched_debug.cpu#13.ttwu_count
23 ± 48% +201.1% 69 ± 12% cpuidle.POLL.usage
12604 ± 11% +161.1% 32904 ± 28% sched_debug.cpu#11.nr_switches
5441 ± 15% +164.0% 14365 ± 27% sched_debug.cpu#11.sched_goidle
12902 ± 9% +163.0% 33936 ± 33% sched_debug.cpu#13.nr_switches
8230 ± 13% +182.2% 23230 ± 20% sched_debug.cpu#1.ttwu_count
13010 ± 9% +153.2% 32947 ± 28% sched_debug.cpu#15.nr_switches
5571 ± 11% +160.5% 14511 ± 30% sched_debug.cpu#13.sched_goidle
13596 ± 13% +172.7% 37082 ± 38% sched_debug.cpu#15.sched_count
7563 ± 16% +200.9% 22762 ± 22% sched_debug.cpu#7.ttwu_count
5598 ± 12% +156.2% 14342 ± 26% sched_debug.cpu#15.sched_goidle
16069 ± 23% +117.8% 34992 ± 25% sched_debug.cpu#14.nr_switches
14194 ± 8% +152.8% 35879 ± 26% sched_debug.cpu#12.nr_switches
13397 ± 11% +158.2% 34598 ± 22% sched_debug.cpu#11.sched_count
14596 ± 9% +148.3% 36240 ± 25% sched_debug.cpu#12.sched_count
13647 ± 10% +150.2% 34139 ± 32% sched_debug.cpu#13.sched_count
6705 ± 20% +127.1% 15225 ± 23% sched_debug.cpu#14.sched_goidle
6177 ± 10% +151.7% 15546 ± 24% sched_debug.cpu#12.sched_goidle
16275 ± 23% +139.7% 39015 ± 17% sched_debug.cpu#14.sched_count
6218 ± 15% +209.6% 19252 ± 45% sched_debug.cpu#10.sched_goidle
21820 ± 6% +123.4% 48742 ± 25% sched_debug.cpu#7.nr_switches
22931 ± 10% +159.5% 59497 ± 44% sched_debug.cpu#5.nr_switches
9865 ± 8% +120.0% 21709 ± 24% sched_debug.cpu#7.sched_goidle
10505 ± 12% +141.8% 25405 ± 37% sched_debug.cpu#5.sched_goidle
12980 ± 6% +107.7% 26956 ± 16% sched_debug.cpu#4.ttwu_count
24231 ± 18% +103.6% 49334 ± 24% sched_debug.cpu#3.nr_switches
11147 ± 14% +99.2% 22210 ± 22% sched_debug.cpu#1.sched_goidle
11092 ± 21% +99.0% 22076 ± 23% sched_debug.cpu#3.sched_goidle
29443 ± 8% +89.3% 55744 ± 20% sched_debug.cpu#4.nr_switches
32087 ± 7% +81.3% 58169 ± 18% sched_debug.cpu#2.nr_switches
12984 ± 17% +111.4% 27446 ± 12% sched_debug.cpu#2.ttwu_count
26458 ± 18% +89.7% 50191 ± 24% sched_debug.cpu#1.nr_switches
14505 ± 8% +98.6% 28807 ± 29% sched_debug.cpu#0.sched_goidle
13628 ± 8% +81.1% 24686 ± 17% sched_debug.cpu#2.sched_goidle
13700 ± 9% +82.6% 25012 ± 18% sched_debug.cpu#4.sched_goidle
33822 ± 9% +102.3% 68417 ± 35% sched_debug.cpu#0.nr_switches
18438 ± 28% +160.1% 47957 ± 23% cpuidle.C1-NHM.usage
6.50 ± 10% +73.2% 11.25 ± 7% turbostat.%c1
14 ± 13% +52.5% 22 ± 12% sched_debug.cfs_rq[13]:/.tg_runnable_contrib
135553 ± 6% +73.5% 235188 ± 6% cpuidle.C3-NHM.usage
723 ± 13% +48.3% 1072 ± 10% sched_debug.cfs_rq[13]:/.avg->runnable_avg_sum
28.84 ± 9% +52.2% 43.89 ± 5% turbostat.%c3
63.48 ± 3% -31.8% 43.29 ± 5% turbostat.%c6
30737 ± 0% -31.0% 21223 ± 1% softirqs.BLOCK
2329 ± 5% +31.1% 3052 ± 11% sched_debug.cfs_rq[14]:/.min_vruntime
3.494e+08 ± 12% +48.6% 5.192e+08 ± 5% cpuidle.C3-NHM.time
1.545e+09 ± 2% -27.1% 1.126e+09 ± 2% cpuidle.C6-NHM.time
26451473 ± 5% -28.7% 18850454 ± 17% cpuidle.C1E-NHM.time
304184 ± 6% +36.3% 414743 ± 6% cpuidle.C6-NHM.usage
362 ± 2% +28.7% 466 ± 6% sched_debug.cfs_rq[0]:/.tg->runnable_avg
363 ± 2% +28.6% 467 ± 6% sched_debug.cfs_rq[1]:/.tg->runnable_avg
364 ± 1% +28.4% 467 ± 6% sched_debug.cfs_rq[2]:/.tg->runnable_avg
367 ± 1% +28.0% 470 ± 6% sched_debug.cfs_rq[3]:/.tg->runnable_avg
369 ± 1% +27.9% 472 ± 5% sched_debug.cfs_rq[4]:/.tg->runnable_avg
977486 ± 1% -21.6% 766721 ± 11% sched_debug.cpu#13.avg_idle
372 ± 1% +27.2% 473 ± 6% sched_debug.cfs_rq[5]:/.tg->runnable_avg
373 ± 1% +27.6% 476 ± 6% sched_debug.cfs_rq[6]:/.tg->runnable_avg
379 ± 1% +27.2% 482 ± 6% sched_debug.cfs_rq[8]:/.tg->runnable_avg
376 ± 1% +27.5% 479 ± 5% sched_debug.cfs_rq[7]:/.tg->runnable_avg
381 ± 1% +26.8% 484 ± 6% sched_debug.cfs_rq[9]:/.tg->runnable_avg
41363 ± 5% +59.4% 65923 ± 48% sched_debug.cpu#0.ttwu_count
384 ± 1% +23.1% 473 ± 8% sched_debug.cfs_rq[10]:/.tg->runnable_avg
986988 ± 0% -19.5% 794664 ± 4% sched_debug.cpu#11.avg_idle
386 ± 1% +22.8% 474 ± 8% sched_debug.cfs_rq[11]:/.tg->runnable_avg
389 ± 2% +22.0% 475 ± 8% sched_debug.cfs_rq[13]:/.tg->runnable_avg
392 ± 2% +21.2% 476 ± 8% sched_debug.cfs_rq[14]:/.tg->runnable_avg
388 ± 1% +22.1% 474 ± 8% sched_debug.cfs_rq[12]:/.tg->runnable_avg
396 ± 2% +20.8% 478 ± 7% sched_debug.cfs_rq[15]:/.tg->runnable_avg
940409 ± 1% -12.3% 824690 ± 4% sched_debug.cpu#0.avg_idle
927692 ± 3% -12.9% 807567 ± 5% sched_debug.cpu#2.avg_idle
3216 ± 5% -10.8% 2870 ± 3% proc-vmstat.nr_alloc_batch
979736 ± 0% -13.5% 847782 ± 4% sched_debug.cpu#12.avg_idle
245057 ± 6% -12.1% 215473 ± 11% numa-vmstat.node1.numa_local
1620 ± 4% -11.5% 1435 ± 8% numa-vmstat.node0.nr_alloc_batch
894470 ± 3% -12.4% 783635 ± 7% sched_debug.cpu#7.avg_idle
965398 ± 2% -11.1% 858414 ± 6% sched_debug.cpu#14.avg_idle
167233 ± 0% +239.7% 568014 ± 0% time.voluntary_context_switches
5760 ± 0% +115.2% 12394 ± 1% vmstat.system.cs
7938 ± 2% +86.4% 14800 ± 2% time.involuntary_context_switches
9 ± 7% +72.2% 15 ± 5% time.percent_of_cpu_this_job_got
10.79 ± 4% +52.9% 16.50 ± 4% time.system_time
1.18 ± 2% +33.8% 1.57 ± 3% turbostat.%c0
394 ± 1% +27.4% 502 ± 1% iostat.sda.wrqm/s
17.69 ± 0% -13.3% 15.33 ± 0% iostat.sda.avgqu-sz
5140 ± 1% +14.8% 5900 ± 0% vmstat.io.bo
5183 ± 1% +14.5% 5935 ± 0% iostat.sda.wkB/s
833 ± 0% -10.5% 746 ± 0% iostat.sda.w/s
122 ± 0% -10.4% 109 ± 0% time.elapsed_time
1174 ± 1% +5.4% 1238 ± 1% vmstat.system.in
2.17 ± 1% -4.6% 2.06 ± 1% turbostat.GHz
1280314 ± 0% +2.9% 1317252 ± 0% time.file_system_outputs

lkp-ne04: Nehalem-EP
Memory: 12G




iostat.sda.wrqm/s

600 ++--------------------------------------------------------------------+
| |
500 O+O O O O O O O O O O O O O O O O O O O O O O O O O O |
| |
| |
400 *+*.*..*.*.*.*.*..*.*.*.*.*.*..*.*.*.*.*..*.*.* * *.*.*.*.*..*.*.*
| : : : |
300 ++ : :: : |
| : : : : |
200 ++ : : : : |
| : : : : |
| : : : : |
100 ++ : :: |
| : : |
0 ++----------------------------------------------*----*----------------+


[*] bisect-good sample
[O] bisect-bad sample

To reproduce:

apt-get install ruby ruby-oj
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/setup-local job.yaml # the job file attached in this email
bin/run-local job.yaml


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Fengguang

---
testcase: fsmark
default_monitors:
  wait: pre-test
  uptime:
  iostat:
  vmstat:
  numa-numastat:
  numa-vmstat:
  numa-meminfo:
  proc-vmstat:
  proc-stat:
  meminfo:
  slabinfo:
  interrupts:
  lock_stat:
  latency_stats:
  softirqs:
  bdi_dev_mapping:
  diskstats:
  cpuidle:
  cpufreq:
  turbostat:
  sched_debug:
    interval: 10
  pmeter:
default_watchdogs:
  watch-oom:
  watchdog:
cpufreq_governor:
- performance
commit: b6c4cf175369b31552fad86422f1f4d9847b16eb
model: Nehalem-EP
memory: 12G
hdd_partitions: "/dev/disk/by-id/ata-ST3500514NS_9WJ03EBA-part3"
swap_partitions: "/dev/disk/by-id/ata-ST3120026AS_5MS07HA2-part2"
rootfs_partition: "/dev/disk/by-id/ata-ST3500514NS_9WJ03EBA-part1"
iterations: 1x
nr_threads: 32t
disk: 1HDD
fs:
- f2fs
fs2:
-
fsmark:
  filesize:
  - 8K
  test_size: 400M
  sync_method: fsyncBeforeClose
  nr_directories: 16d
  nr_files_per_directory: 256fpd
testbox: lkp-ne04
tbox_group: lkp-ne04
kconfig: x86_64-rhel
enqueue_time: 2014-12-13 00:40:16.264380860 +08:00
head_commit: b6c4cf175369b31552fad86422f1f4d9847b16eb
base_commit: b2776bf7149bddd1f4161f14f79520f17fc1d71d
branch: linux-devel/devel-hourly-2014121201
kernel: "/kernel/x86_64-rhel/b6c4cf175369b31552fad86422f1f4d9847b16eb/vmlinuz-3.18.0-gb6c4cf1"
user: lkp
queue: cyclic
rootfs: debian-x86_64.cgz
result_root: "/result/lkp-ne04/fsmark/performance-1x-32t-1HDD-f2fs-8K-400M-fsyncBeforeClose-16d-256fpd/debian-x86_64.cgz/x86_64-rhel/b6c4cf175369b31552fad86422f1f4d9847b16eb/0"
job_file: "/lkp/scheduled/lkp-ne04/cyclic_fsmark-performance-1x-32t-1HDD-f2fs-8K-400M-fsyncBeforeClose-16d-256fpd-x86_64-rhel-HEAD-b6c4cf175369b31552fad86422f1f4d9847b16eb-0.yaml"
dequeue_time: 2014-12-13 07:56:47.578843098 +08:00
job_state: finished
loadavg: 21.49 9.31 3.47 1/210 5298
start_time: '1418428648'
end_time: '1418428756'
version: "/lkp/lkp/.src-20141212-075301"
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
mkfs -t f2fs /dev/sda3
mount -t f2fs /dev/sda3 /fs/sda3
./fs_mark -d /fs/sda3/1 -d /fs/sda3/2 -d /fs/sda3/3 -d /fs/sda3/4 -d /fs/sda3/5 -d /fs/sda3/6 -d /fs/sda3/7 -d /fs/sda3/8 -d /fs/sda3/9 -d /fs/sda3/10 -d /fs/sda3/11 -d /fs/sda3/12 -d /fs/sda3/13 -d /fs/sda3/14 -d /fs/sda3/15 -d /fs/sda3/16 -d /fs/sda3/17 -d /fs/sda3/18 -d /fs/sda3/19 -d /fs/sda3/20 -d /fs/sda3/21 -d /fs/sda3/22 -d /fs/sda3/23 -d /fs/sda3/24 -d /fs/sda3/25 -d /fs/sda3/26 -d /fs/sda3/27 -d /fs/sda3/28 -d /fs/sda3/29 -d /fs/sda3/30 -d /fs/sda3/31 -d /fs/sda3/32 -D 16 -N 256 -n 1600 -L 1 -S 1 -s 8192
_______________________________________________
LKP mailing list
LKP@xxxxxxxxxxxxxxx