Re: performance changes on d4b4c2cd: 37.6% fsmark.files_per_sec, -15.9% fsmark.files_per_sec, and few more

From: NeilBrown
Date: Tue Mar 24 2015 - 23:05:25 EST


On Wed, 18 Mar 2015 13:03:19 +0800 Yuanhan Liu <yuanhan.liu@xxxxxxxxxxxxxxx>
wrote:

> Hi,
>
> FYI, we noticed performance changes on `fsmark.files_per_sec' by d4b4c2cdffab86f5c7594c44635286a6d277d5c6:
>
> > commit d4b4c2cdffab86f5c7594c44635286a6d277d5c6
> > Author: shli@xxxxxxxxxx <shli@xxxxxxxxxx>
> > AuthorDate: Mon Dec 15 12:57:03 2014 +1100
> > Commit: NeilBrown <neilb@xxxxxxx>
> > CommitDate: Wed Mar 4 13:40:17 2015 +1100
> >
> > RAID5: batch adjacent full stripe write

Thanks a lot for this one too!
Generally positive, with the only regressions on NoSync tests. Maybe the
same cause?

Again,
> 7 ± 5% +37.6% 10 ± 6% fsmark.time.percent_of_cpu_this_job_got
and
> 9 ± 0% -14.8% 7 ± 6% fsmark.time.percent_of_cpu_this_job_got

are a bit confusing - really less than 10% of a CPU ??
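For reference, time(1) derives that metric as (user + sys CPU seconds) divided by elapsed wall-clock seconds, so a single-thread job that spends almost all of its time waiting on disk can legitimately score under 10% of a CPU. A minimal sketch of the arithmetic; only elapsed_time comes from the table below, the CPU-seconds figure is an illustrative assumption:

```python
# time(1)'s "Percent of CPU this job got" is (user + sys) / elapsed * 100.
elapsed = 667.0      # fsmark.time.elapsed_time for d4b4c2cd, from the table below
cpu_seconds = 66.7   # assumed: user + sys CPU time of the fsmark process
cpu_percent = 100.0 * cpu_seconds / elapsed
print(round(cpu_percent))  # → 10
```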

Thanks,
NeilBrown


>
> c1dfe87e41d9c2926fe92f803f02c733ddbccf0b d4b4c2cdffab86f5c7594c44635286a6d277d5c6
> ---------------------------------------- ----------------------------------------
> run time(m) metric_value ±stddev run time(m) metric_value ±stddev change testbox/benchmark/sub-testcase
> --- ------ ---------------------------- --- ------ ---------------------------- -------- ------------------------------
> 4 15.3 33.525 ±3.0% 6 11.1 46.133 ±5.0% 37.6% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
> 3 0.5 262.800 ±1.5% 3 0.4 307.367 ±1.2% 17.0% ivb44/fsmark/1x-1t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
> 3 0.5 289.900 ±0.3% 3 0.4 323.367 ±2.4% 11.5% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-NoSync
> 3 0.5 325.667 ±2.2% 3 0.5 358.800 ±1.8% 10.2% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
> 3 0.6 216.100 ±0.4% 3 0.6 230.100 ±0.4% 6.5% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
> 3 0.5 309.900 ±0.3% 3 0.5 328.500 ±1.1% 6.0% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-xfs-4M-30G-NoSync
>
> 3 13.8 37.000 ±0.2% 3 16.5 31.100 ±0.3% -15.9% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
>
> NOTE: here is some more info about those test parameters, to help you
> understand the testcase better:
>
> 1x : 'x' means iterations (loops), corresponding to the '-L' option of fsmark
> 64t: 't' means threads
> 4M : the size of a single file, corresponding to the '-s' option of fsmark
> 120G, 30G: the total test size
>
> 4BRD_12G: BRD is the ram disk; '4' means 4 ramdisks and '12G' is the size
> of each ramdisk, so 48G in total. The RAID array is built on
> those ramdisks.
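Putting those parameters together, a sub-testcase name like 1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync maps roughly onto an fs_mark command line as sketched below. The mount point is an assumption, and the '-S 0' spelling of the NoSync policy should be checked against `fs_mark -h`; the file count just follows from total size / file size / threads:

```shell
FILE_SIZE=$((4 * 1024 * 1024))          # 4M per file ('-s')
TOTAL=$((120 * 1024 * 1024 * 1024))     # 120G total test size
THREADS=1                               # '1t'
LOOPS=1                                 # '1x', fs_mark's '-L'
FILES=$((TOTAL / FILE_SIZE / THREADS))  # files per thread: 120G / 4M = 30720
# NoSync is assumed to be fs_mark's '-S 0' (no sync/fsync calls);
# /mnt/md0 is an assumed mount point for the RAID5 array.
echo "fs_mark -S 0 -L $LOOPS -t $THREADS -n $FILES -s $FILE_SIZE -d /mnt/md0"
```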
>
>
> And FYI, here I listed more detailed changes for the largest positive and negative changes.
>
>
> more detailed changes about ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
> ---------
>
> c1dfe87e41d9c292 d4b4c2cdffab86f5c7594c4463
> ---------------- --------------------------
> %stddev %change %stddev
> \ | \
> 33.53 ± 3% +37.6% 46.13 ± 4% fsmark.files_per_sec
> 916 ± 3% -27.2% 667 ± 5% fsmark.time.elapsed_time.max
> 916 ± 3% -27.2% 667 ± 5% fsmark.time.elapsed_time
> 7 ± 5% +37.6% 10 ± 6% fsmark.time.percent_of_cpu_this_job_got
> 92097 ± 2% -23.1% 70865 ± 4% fsmark.time.voluntary_context_switches
> 0.04 ± 42% +681.0% 0.27 ± 22% turbostat.Pkg%pc3
> 716062 ± 3% -82.7% 124210 ± 21% cpuidle.C1-IVT.usage
> 6.883e+08 ± 2% -86.8% 91146705 ± 34% cpuidle.C1-IVT.time
> 0.04 ± 30% +145.8% 0.10 ± 25% turbostat.CPU%c3
> 404 ± 16% -58.4% 168 ± 14% cpuidle.POLL.usage
> 159 ± 47% +179.5% 444 ± 23% proc-vmstat.kswapd_low_wmark_hit_quickly
> 11133 ± 23% +100.3% 22298 ± 30% cpuidle.C3-IVT.usage
> 10286681 ± 27% +95.6% 20116924 ± 27% cpuidle.C3-IVT.time
> 7.92 ± 16% +77.4% 14.05 ± 6% turbostat.Pkg%pc6
> 4.93 ± 3% -38.6% 3.03 ± 2% turbostat.CPU%c1
> 916 ± 3% -27.2% 667 ± 5% time.elapsed_time.max
> 916 ± 3% -27.2% 667 ± 5% time.elapsed_time
> 2137390 ± 3% -26.7% 1566752 ± 5% proc-vmstat.pgfault
> 7 ± 5% +37.6% 10 ± 6% time.percent_of_cpu_this_job_got
> 4.309e+10 ± 3% -26.3% 3.176e+10 ± 5% cpuidle.C6-IVT.time
> 49038 ± 2% -23.9% 37334 ± 4% uptime.idle
> 1047 ± 2% -23.8% 797 ± 4% uptime.boot
> 92097 ± 2% -23.1% 70865 ± 4% time.voluntary_context_switches
> 4005888 ± 0% +13.3% 4537685 ± 11% meminfo.DirectMap2M
> 3917 ± 2% -16.3% 3278 ± 5% proc-vmstat.pageoutrun
> 213737 ± 1% -13.9% 183969 ± 3% softirqs.SCHED
> 46.86 ± 1% +16.5% 54.59 ± 1% turbostat.Pkg%pc2
> 32603 ± 3% -11.7% 28781 ± 5% numa-vmstat.node1.nr_unevictable
> 130415 ± 3% -11.7% 115127 ± 5% numa-meminfo.node1.Unevictable
> 256781 ± 2% -8.8% 234146 ± 3% softirqs.TASKLET
> 253606 ± 2% -8.9% 231108 ± 3% softirqs.BLOCK
> 119.10 ± 2% -70.0% 35.78 ± 13% iostat.sdc.rrqm/s
> 119.86 ± 1% -70.3% 35.64 ± 12% iostat.sdb.rrqm/s
> 117.13 ± 2% -70.2% 34.96 ± 11% iostat.sda.rrqm/s
> 504 ± 2% -67.6% 163 ± 12% iostat.sdc.rkB/s
> 507 ± 1% -67.9% 163 ± 12% iostat.sdb.rkB/s
> 496 ± 2% -67.7% 160 ± 11% iostat.sda.rkB/s
> 15392 ± 3% +37.8% 21203 ± 5% iostat.sdb.wrqm/s
> 15393 ± 3% +37.7% 21203 ± 5% iostat.sdc.wrqm/s
> 15392 ± 3% +37.7% 21203 ± 5% iostat.sda.wrqm/s
> 125236 ± 3% +37.7% 172422 ± 4% vmstat.io.bo
> 125181 ± 3% +37.6% 172303 ± 4% iostat.md0.wkB/s
> 552 ± 3% +37.6% 760 ± 4% iostat.md0.w/s
> 62611 ± 3% +37.6% 86167 ± 4% iostat.sdb.wkB/s
> 62613 ± 3% +37.6% 86167 ± 4% iostat.sdc.wkB/s
> 62613 ± 3% +37.6% 86168 ± 4% iostat.sda.wkB/s
> 40.24 ± 1% -18.5% 32.81 ± 2% turbostat.CorWatt
> 200 ± 0% +22.2% 245 ± 2% iostat.sdc.w/s
> 1020 ± 2% +21.7% 1242 ± 2% vmstat.system.in
> 200 ± 0% +22.1% 245 ± 2% iostat.sda.w/s
> 200 ± 0% +22.2% 245 ± 2% iostat.sdb.w/s
> 69.99 ± 0% -12.4% 61.34 ± 2% turbostat.PkgWatt
> 3943 ± 2% -8.9% 3593 ± 1% vmstat.system.cs
> 1.51 ± 1% +6.1% 1.60 ± 2% iostat.sdb.avgqu-sz
> 3.21 ± 0% +5.4% 3.39 ± 1% turbostat.RAMWatt
> 256182 ± 1% -4.2% 245424 ± 1% iostat.md0.avgqu-sz
>
>
>
> more detailed changes about ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
> ---------
>
> c1dfe87e41d9c292 d4b4c2cdffab86f5c7594c4463
> ---------------- --------------------------
> %stddev %change %stddev
> \ | \
> 37.00 ± 0% -15.9% 31.10 ± 0% fsmark.files_per_sec
> 63414 ± 4% +57.6% 99945 ± 1% fsmark.time.voluntary_context_switches
> 830 ± 0% +18.8% 987 ± 0% fsmark.time.elapsed_time
> 830 ± 0% +18.8% 987 ± 0% fsmark.time.elapsed_time.max
> 9 ± 0% -14.8% 7 ± 6% fsmark.time.percent_of_cpu_this_job_got
> 1.48 ± 20% +357.3% 6.75 ± 5% turbostat.Pkg%pc6
> 63414 ± 4% +57.6% 99945 ± 1% time.voluntary_context_switches
> 109 ± 15% -37.8% 68 ± 20% time.involuntary_context_switches
> 338 ± 17% +57.6% 533 ± 0% cpuidle.POLL.usage
> 2691 ± 1% -20.3% 2144 ± 12% proc-vmstat.kswapd_high_wmark_hit_quickly
> 1060792 ± 0% +20.2% 1275544 ± 0% cpuidle.C6-IVT.usage
> 3.876e+10 ± 0% +19.3% 4.625e+10 ± 0% cpuidle.C6-IVT.time
> 830 ± 0% +18.8% 987 ± 0% time.elapsed_time.max
> 830 ± 0% +18.8% 987 ± 0% time.elapsed_time
> 39984 ± 0% +18.6% 47434 ± 0% uptime.idle
> 856 ± 0% +18.4% 1014 ± 0% uptime.boot
> 15874 ± 12% +20.9% 19188 ± 6% slabinfo.anon_vma.active_objs
> 1942445 ± 0% +18.1% 2293524 ± 0% proc-vmstat.pgfault
> 15977 ± 12% +20.1% 19188 ± 6% slabinfo.anon_vma.num_objs
> 110388 ± 9% +13.0% 124724 ± 4% meminfo.DirectMap4k
> 3107 ± 8% -20.9% 2459 ± 15% numa-meminfo.node0.AnonHugePages
> 18408 ± 11% +15.0% 21165 ± 3% slabinfo.free_nid.active_objs
> 18880 ± 11% +13.7% 21465 ± 4% slabinfo.free_nid.num_objs
> 1125535 ± 0% -11.5% 996605 ± 1% cpuidle.C1-IVT.usage
> 9 ± 0% -14.8% 7 ± 6% time.percent_of_cpu_this_job_got
> 198260 ± 1% +11.7% 221366 ± 0% softirqs.SCHED
> 6.09 ± 2% -12.2% 5.34 ± 0% turbostat.CPU%c1
> 14203 ± 2% -13.1% 12346 ± 8% slabinfo.kmalloc-256.num_objs
> 13763 ± 3% -13.3% 11937 ± 9% slabinfo.kmalloc-256.active_objs
> 1255 ± 6% +10.1% 1383 ± 1% slabinfo.RAW.num_objs
> 1255 ± 6% +10.1% 1383 ± 1% slabinfo.RAW.active_objs
> 30.37 ± 3% +30.5% 39.62 ± 0% iostat.sdc.rrqm/s
> 31.23 ± 5% +28.0% 39.98 ± 1% iostat.sdb.rrqm/s
> 33.37 ± 3% +19.0% 39.72 ± 2% iostat.sda.rrqm/s
> 562 ± 0% -15.9% 472 ± 0% iostat.md0.w/s
> 17106 ± 0% -15.9% 14382 ± 0% iostat.sda.wrqm/s
> 17106 ± 0% -15.9% 14382 ± 0% iostat.sdc.wrqm/s
> 17106 ± 0% -15.9% 14382 ± 0% iostat.sdb.wrqm/s
> 69317 ± 0% -15.9% 58284 ± 0% iostat.sdc.wkB/s
> 69316 ± 0% -15.9% 58284 ± 0% iostat.sda.wkB/s
> 69317 ± 0% -15.9% 58284 ± 0% iostat.sdb.wkB/s
> 138603 ± 0% -15.9% 116543 ± 0% iostat.md0.wkB/s
> 138705 ± 0% -15.9% 116633 ± 0% vmstat.io.bo
> 213 ± 0% -14.5% 182 ± 0% iostat.sdb.w/s
> 213 ± 0% -14.5% 182 ± 0% iostat.sda.w/s
> 213 ± 0% -14.6% 182 ± 0% iostat.sdc.w/s
> 4731 ± 0% -12.7% 4131 ± 0% vmstat.system.cs
> 1133 ± 2% -12.3% 993 ± 0% vmstat.system.in
> 3.02 ± 3% -8.6% 2.76 ± 3% iostat.sdc.avgqu-sz
> 3.29 ± 2% -9.4% 2.98 ± 3% iostat.sdb.avgqu-sz
> 25 ± 19% -21.3% 19 ± 2% turbostat.Avg_MHz
> 3.10 ± 1% -9.4% 2.81 ± 1% iostat.sda.avgqu-sz
> 44.45 ± 1% -5.6% 41.94 ± 2% turbostat.CorWatt
> 0.75 ± 19% -20.1% 0.60 ± 4% turbostat.%Busy
> 74.92 ± 1% -4.9% 71.23 ± 2% turbostat.PkgWatt
