Re: o_sync in vfat driver

From: col-pepper
Date: Thu Mar 30 2006 - 12:35:56 EST


On Wed, 29 Mar 2006 04:13:03 +0200, Mathis Ahrens <Mathis.Ahrens@xxxxxx> wrote:

Hi all,

Chris Mason wrote:
On Monday 27 February 2006 18:12, Andrew Morton wrote:

We don't know that the same number of same-sized write()s were happening in
each case.

There's been some talk about implementing fsync()-on-file-close for this
problem, and some protopatches. But nothing final yet.


Here's the patch I'm using in -suse right now. What I want to do is make a much more generic -o flush, but it'll still need a few bits in individual filesystem to kick off metadata writes quickly.

The basic goal behind the code is to trigger writes without waiting for both
data and metadata. If the user is watching the memory stick, when the little light stops flashing all the data and metadata will be on disk.

It also generally throttles userland a little during file release. This could be changed to throttle for each page dirtied, but most users I asked liked the current setup better.


I like the idea and would like to see something like this in mainline.

Here is some non-scientific benchmark done with 2.6.16, comparing
default mount and flush mount of a USB2 stick:

/////////////////////////////////////////////////////////////////////
Single File "Test": 43MB
$ time cp Test /media/usbdisk/test/ && time umount /media/usbdisk/
/////////////////////////////////////////////////////////////////////

VANILLA:

real 0m3.770s
user 0m0.004s
sys 0m0.308s

real 0m9.439s
user 0m0.000s
sys 0m0.040s

FLUSH:

real 0m6.000s
user 0m0.012s
sys 0m0.400s

real 0m3.668s
user 0m0.000s
sys 0m0.028s

REAL TIME RATIO (FLUSH/VANILLA):
9.6 / 13.1 = 0.73

/////////////////////////////////////////////////////////////////////
Directory Tree "flushtest": 44MB (8866 files, 1820 dirs)
$ time cp -R flushtest/ /media/usbdisk/ && time umount /media/usbdisk/
/////////////////////////////////////////////////////////////////////

VANILLA:

real 0m0.966s
user 0m0.024s
sys 0m0.860s

real 1m11.962s
user 0m0.004s
sys 0m0.160s

FLUSH:

real 1m41.645s
user 0m0.032s
sys 0m1.112s

real 0m4.660s
user 0m0.004s
sys 0m0.068s

REAL TIME RATIO (FLUSH/VANILLA):
106.3 / 77.9 = 1.36



That's interesting, albeit non-scientific, I think it is quite informative.

There are two basic problems with the current code: speed is down by around and order of magnitude compared to a non-synced write and the fact that the code is hammering the FAT. The two are obviously related.

Viewing the system globally rather than considering the details of the techniques used, it would seem that any algorithm that does not drastically reduce write times, at least on the one large file test , is missing the mark and presumably repeating the problem in a slightly different way.

Not knocking the efforts Chris has put in , it's great to see this is getting some attention, but I think viewing overall performance times as shown above gives a touchstone as to whether any particular proto is effective.

The fact that flush can be almost 40% slower in some cases is worrying.

Thanks for the info.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/