Re: [performance regression, bisected] scheduler:should_we_balance() kills filesystem performance

From: Joonsoo Kim
Date: Tue Sep 10 2013 - 02:54:37 EST


On Tue, Sep 10, 2013 at 04:15:20PM +1000, Dave Chinner wrote:
> On Tue, Sep 10, 2013 at 01:47:59PM +0900, Joonsoo Kim wrote:
> > On Tue, Sep 10, 2013 at 02:02:54PM +1000, Dave Chinner wrote:
> > > Hi folks,
> > >
> > > I just updated my performance test VM to the current 3.12-git
> > > tree after the XFS dev branch was merged. The first test I ran
> > > which was a 16-way concurrent fsmark test to create lots of files
> > > gave me a number about 30% lower than I expected - ~180k files/s
> > > when I was expecting somewhere around 250k files/s.
> > >
> > > I did a bisect, and the bisect landed on this commit:
> > >
> > > commit 23f0d2093c789e612185180c468fa09063834e87
> > > Author: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> > > Date: Tue Aug 6 17:36:42 2013 +0900
> > >
> > > sched: Factor out code to should_we_balance()
> .....
> > >
> > >                   v4 filesystem    v5 filesystem
> > > 3.11+xfsdev:      220k files/s     225k files/s
> > > 3.12-git          180k files/s     185k files/s
> > > 3.12-git-revert   245k files/s     247k files/s
> > >
> > > The test vm is a 16p/16GB RAM VM, with a sparse 100TB filesystem
> > > image sitting on a 4-way RAID0 SSD array formatted with XFS and the
> > > image file is accessed by virtio+direct IO. The fsmark command line
> > > is:
> > >
> > > time ./fs_mark -D 10000 -S0 -n 100000 -s 0 -L 32 \
> > > -d /mnt/scratch/0 -d /mnt/scratch/1 \
> > > -d /mnt/scratch/2 -d /mnt/scratch/3 \
> > > -d /mnt/scratch/4 -d /mnt/scratch/5 \
> > > -d /mnt/scratch/6 -d /mnt/scratch/7 \
> > > -d /mnt/scratch/8 -d /mnt/scratch/9 \
> > > -d /mnt/scratch/10 -d /mnt/scratch/11 \
> > > -d /mnt/scratch/12 -d /mnt/scratch/13 \
> > > -d /mnt/scratch/14 -d /mnt/scratch/15 \
> > > | tee >(stats --trim-outliers | tail -1 1>&2)
> > >
> > > The workload on XFS runs to almost being CPU bound - the effect of
> > > the above patch was that there was a lot of idle time left in the
> > > system. The workload consumed the same amount of user and system
> > > CPU, just instantaneous CPU usage was reduced by 20-30% and the
> > > elapsed time was increased by 20-30%.
> >
> > Hello, Dave.
> >
> > I looked at this patch again and found one mistake.
> > If we find that we are the appropriate CPU for balancing,
> > should_we_balance() should return 1, but the current code doesn't do so.
> > This corresponds with your observation that a lot of idle time was left.
> >
> > Could you re-test your benchmark with the patch below?
>
> Sure. It looks like your patch fixes the problem:
>
>                   v4 filesystem    v5 filesystem
> 3.11+xfsdev:      220k files/s     225k files/s
> 3.12-git          180k files/s     185k files/s
> 3.12-git-revert   245k files/s     247k files/s
> 3.12-git-fix      249k files/s     248k files/s
>
> Thanks for the quick turnaround :)
>
> Tested-by: Dave Chinner <dchinner@xxxxxxxxxx>
>

Thanks for the quick turnaround, too. :)

Hello, Ingo.

I am attaching the formatted patch with proper SOBs and a commit message.
Please merge it to fix the problem above.
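
For reference, the contract discussed in this thread (return 1 when the
calling CPU is the appropriate one to balance) can be sketched as a toy
model in C. This is not the kernel code and not the attached patch: the
function names, the flat array of idle flags, and the "first idle CPU,
otherwise the first CPU in the group" selection rule are simplifying
assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model, not kernel code (hypothetical helper): pick the CPU that
 * should balance on behalf of a group. Assumed rule: prefer the first
 * idle CPU, otherwise fall back to the first CPU in the group.
 */
static int pick_balance_cpu(const bool *cpu_idle, size_t nr_cpus)
{
	assert(nr_cpus > 0);

	for (size_t i = 0; i < nr_cpus; i++) {
		if (cpu_idle[i])
			return (int)i;
	}
	return 0;
}

/*
 * Hypothetical stand-in for the contract: returns 1 when this_cpu is the
 * designated balancer and 0 otherwise. Returning 0 for the designated CPU
 * would mean nobody balances, which leaves CPUs idle.
 */
static int model_should_we_balance(const bool *cpu_idle, size_t nr_cpus,
				   int this_cpu)
{
	return pick_balance_cpu(cpu_idle, nr_cpus) == this_cpu;
}
```

In this model, if the comparison were inverted, the designated CPU would
skip balancing while every other CPU attempted it, which is one way the
idle time reported above could arise.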

Thanks.

--------------------->8-------------------------