Re: [GIT PULL] Ext3 latency fixes

From: Linus Torvalds
Date: Fri Apr 03 2009 - 15:06:06 EST




On Fri, 3 Apr 2009, Linus Torvalds wrote:

>
>
> On Fri, 3 Apr 2009, Theodore Ts'o wrote:
> >
> > Please pull from:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git ext3-latency-fixes
>
> Thanks, pulled. I'll be interested to see how it feels. Will report back
> after I've rebuilt and gone through a few more emails.

Hmm.

The "overwrite" behavior may well be better, but it was smooth enough
beforehand too (never having more than ~8MB dirty). The "create big file
and sync" workload causes huge fsync pauses, though. IOW, try with

while :
do
    time sh -c "dd if=/dev/zero of=bigfile bs=8M count=256 ; sync"
done

and even really small fsync's end up being at the end of all that
unrelated activity, and you see things like

fsync(7) = 0 <32.756308>

That was my "switch email folders with update" test case. The full trace
for that file descriptor is

open("/home/torvalds/mail/git-list", O_RDWR) = 7 <0.000010>
fstatfs(7, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=19230104, f_bfree=13853292, f_bavail=12876440, f_files=4890624
flock(7, LOCK_EX) = 0 <0.000009>
fstat(7, {st_mode=S_IFREG|0600, st_size=54231534, ...}) = 0 <0.000005>
lseek(7, 0, SEEK_SET) = 0 <0.000006>
write(7, "From MAILER-DAEMON Fri Apr 3 11:"..., 554) = 554 <0.000012>
lseek(7, 54202529, SEEK_SET) = 54202529 <0.000007>
read(7, "From torvalds@xxxxxxxxxxxxxxxxxxx"..., 66) = 66 <0.000008>
lseek(7, 54202595, SEEK_SET) = 54202595 <0.000006>
read(7, "Return-Path: <git-owner@xxxxxxxxx"..., 2915) = 2915 <0.000007>
lseek(7, 54202529, SEEK_SET) = 54202529 <0.000005>
write(7, "From torvalds@xxxxxxxxxxxxxxxxxxx"..., 2981) = 2981 <0.000009>
ftruncate(7, 54231534) = 0 <0.000008>
fsync(7) = 0 <32.756308>
close(7) = 0 <0.000006>

so it had done just a few kB of writes, but because it ended up behind
the humongous backlog of 'bigfile', that didn't help much.
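
For anyone who wants to reproduce the small-fsync side without a mail client, here is a minimal sketch of a probe. It is not the test case from the trace above, only an approximation: the name fsync_probe and the default file name smallfile are made up for illustration, and it assumes GNU dd (for conv=fsync, which makes dd fsync the output file before exiting) and GNU date (for %N).

```shell
# Hypothetical helper, not from the original mail: write one 4kB block to a
# small file, force it to disk, and print how long that took.
# Assumes GNU dd (conv=fsync) and GNU date (%N for nanoseconds).
fsync_probe() {
    target=${1:-smallfile}      # file to write; defaults to ./smallfile
    start=$(date +%s.%N)        # wall-clock time before the write
    # conv=fsync: dd calls fsync() on the output file before it exits
    dd if=/dev/zero of="$target" bs=4k count=1 conv=fsync 2>/dev/null || return 1
    end=$(date +%s.%N)          # wall-clock time after the fsync returned
    # Print the elapsed time in seconds
    echo "$start $end" | awk '{ printf "fsync probe: %.3f s\n", $2 - $1 }'
}
```

Running the 'bigfile' loop in one terminal and something like "while sleep 1; do fsync_probe; done" in another, on the same filesystem, should show whether the small write gets stuck behind the big one. The numbers will include process startup overhead, so treat them as rough.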

Also, it's maybe worth noting that you don't actually need a 2GB file to
trigger this behavior. Change that "count=256" into a "count=16", and you
now have a simulation of just writing 128MB at a time, with a "sync" in
between to make sure it hits the disk. It makes the pauses smaller, but
they are still several seconds.
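
As a sketch, the same loop with the block count as a parameter might look like the following. The function name write_and_sync and its arguments are hypothetical; it assumes GNU dd, where bs=8M means 8MB blocks, so count=16 writes 128MB and count=256 writes 2GB.

```shell
# Hypothetical helper, not from the original mail: write <count> blocks of
# 8MB of zeroes to a file, then sync so the data has to hit the disk.
# Assumes GNU dd (bs=8M). 16 blocks = 128MB, 256 blocks = 2GB.
write_and_sync() {
    count=${1:-16}              # number of 8MB blocks to write
    file=${2:-bigfile}          # output file; defaults to ./bigfile
    dd if=/dev/zero of="$file" bs=8M count="$count" 2>/dev/null && sync
}
```

In bash, "while : ; do time write_and_sync 16 ; done" then behaves like the loop above with the smaller count.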

(That, btw, is probably more the kind of thing I see when doing a "yum
update". I assume a package manager would do exactly that kind of "unpack
files and sync" in a loop).

Btw, I assume this same thing holds true for ext4 too? Because it shows
how two different "sync" operations interact, and one kills the
performance of the other one. So as long as there is a _single_ fsync()
user, you're fine. It's when you get more than one...

(Again, I have that Intel SSD that should do pretty reliably 40+MB/s even
with really nasty write patterns, so I do need several hundred megs to
really see painful pauses. On a slower disk you'd need much less).

Linus