Re: fsync(2) weirdness

Gordon Oliver (gordo@telsur.cl)
Thu, 2 Oct 1997 20:50:09 -0400 (CST)


...Scott Laird says...:
> Can someone explain this ('strace -r -T' output, showing relative
> times on the left and total time spent in the syscall in brackets on
> the right):
>
> 0.000000 lseek(4, 8192, SEEK_SET) = 8192 <0.000347>
> 0.030219 write(4, "\10\2\23\0 \0\0\2"..., 8192) = 8192 <0.019031>
> 0.019484 fsync(4) = 0 <100.531970>
> 100.532680 lseek(4, 1420918784, SEEK_SET) = 1420918784 <0.000105>
> 0.000325 read(4, "\6\0\23\0\0\346\01644"..., 8192) = 8192 <0.014335>
> 0.014785 lseek(4, 1421033472, SEEK_SET) = 1421033472 <0.000093>
> 0.000298 write(4, "\4\2\23\0\0$\1\0\377"..., 8192) = 8192 <0.000336>
> 0.000633 fsync(4) = 0 <103.124272>

You didn't say which fs this is on... probably ext2?

The problem would appear to be a combination of file size and sync behaviour
(assuming that this is ext2). On any file, ext2 makes precisely two passes
over the _entire_ file looking for buffers that are out of sync, and syncing
them. In this case, if the file is ~1.4 GB long, it will have to look up every
buffer in the file, and then check those present to see if they are up to date.
Unfortunately, if most of them are not in memory, the search time is even
worse, since a failed lookup has to walk its whole hash chain...

I sent a patch to buffer.c to the list at some point that chopped about
10% off the buffer search time (or at least I thought I sent it to the list).
It makes the buffer hash table larger and changes the hashing function...
But since the search is only part of the work, this will cut something less
than 10% off the fsync time...

... could you turn on kernel profiling and send the top few functions
while this is running? Unless it is doing a lot of I/O, the 100 seconds
still seems way too long... (hmm. 1.4G/512 = ~2.8M... 2.8 million buffer
lookups per pass... ouch) Is that file full of holes, or is it completely
full of data?
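
For anyone who wants to redo that arithmetic, here is a small C sketch.
It assumes the file ends right after the highest offset written in the
trace (1421033472 + 8192 = 1421041664 bytes), which is only a lower bound
on the real file size, and it treats the 512-byte buffer size from the
estimate above as a parameter, since the actual ext2 block size was not
given. The function names are made up for illustration.

```c
#include <assert.h>

/* Number of block_size-byte buffers needed to cover file_size bytes,
 * rounded up. block_size is an assumption: the post uses 512. */
static unsigned long long buffers_in_file(unsigned long long file_size,
                                          unsigned long long block_size)
{
    assert(block_size > 0);
    return (file_size + block_size - 1) / block_size;
}

/* Total buffer lookups for one fsync() if every pass walks the whole
 * file, as described above (two passes in the ext2 case). */
static unsigned long long lookups_per_fsync(unsigned long long file_size,
                                            unsigned long long block_size,
                                            unsigned int passes)
{
    return (unsigned long long)passes * buffers_in_file(file_size, block_size);
}
```

With file_size = 1421041664 and block_size = 512 this gives 2775472
buffers, so roughly 5.5 million lookups for two passes; with 1024-byte
blocks it would be half that.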

If you have access to the source, try changing the way it does things.
You can:
1) use O_SYNC on the file, so that each write() reaches the disk
   before it returns, which does what you need.
2) use sync() to sync the entire disk... which will mostly
   do what you want, but on busy systems will do a whole lot
   more... since it only checks buffers that are in memory, it
   may actually take quite a lot less time...
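
A minimal sketch of option 1, assuming the program does the same
lseek()/write() pairs as in the trace. The helper names and the file
mode are hypothetical; only open(), lseek(), write() and the O_SYNC
flag are real. With O_SYNC set at open time there is no separate
fsync() call after each write.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open (creating if needed) a file so that every write() is
 * synchronous. Returns the descriptor, or -1 on error. */
static int open_osync(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_SYNC, 0644);
}

/* Seek to offset and write len bytes from buf.
 * Returns 0 on success, -1 on error or short write.
 * Because fd was opened with O_SYNC, no fsync() follows. */
static int write_block_at(int fd, off_t offset, const void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    n = write(fd, buf, len);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

The trade-off is that every write() now waits for the disk, but nothing
has to walk the whole file afterwards. Whether that is a net win for
this workload has not been measured here.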

Anybody else have better ideas? (Flame me if I'm totally off base.)

---------------------------------------------------------------
Gordon Oliver (gordo@telsur.cl) Independent Consultant
... Available for consulting on Linux ...