Re: Syncing a file's metadata in a portable way

From: Andrew Morton
Date: Sun Jul 11 2004 - 05:37:51 EST


bert hubert <ahu@xxxxxxx> wrote:
>
> On Sat, Jul 10, 2004 at 01:14:59PM -0700, Andrew Morton wrote:
>
> > If only the one file has been written to, an fsync on ext3 shouldn't
> > produce any more writeout than an fsync on ext2.
> (...)
> > Either that, or SQLite is broken.
>
> I'll show strace and vmstat tomorrow - I found very little writes, no mmap,
> some fsync and massive writeouts. On ext2, performance was two orders of
> magnitude better.
>

One scenario which could cause this is if the application is writing a
large amount of data to a file and is repeatedly *overwriting* that data.
And the application is repeatedly adding new blocks to, and fsyncing a
separate file.

strace might tell us that, if the traces are skilfully captured and studied.

You should try data=writeback. Given that the app is using fsync() for its
own data integrity purposes anyway, you don't need data=ordered.

It's strange though. databases often preallocate the file space, so a
regular write won't add new blocks to the file and won't allocate any new
metadata. In this situation, an fsync() will only force a commit once per
second, when the inode mtime changes.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/