Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

From: Theodore Tso
Date: Mon Nov 03 2008 - 17:56:18 EST


On Mon, Nov 03, 2008 at 02:27:06PM -0800, Andrew Morton wrote:
> It should clear s_dirt before doing the "i/o", methinks?

Yep, good point. As I mentioned earlier, though, I'm about 99% sure
that the right fix is to remove all mention of s_dirt entirely, and in
fact we can make super_operations.write_super be NULL for ext3 and
ext4. But for now we should just keep it in its usual place, and
save that for a cleanup commit later on.

- Ted

commit b20506dc713db1105287b691390563d2aace6d84
Author: Theodore Ts'o <tytso@xxxxxxx>
Date: Mon Nov 3 17:54:41 2008 -0500

ext4: wait on all pending commits in ext4_sync_fs()

In ext4_sync_fs, we only wait for a commit to finish if we started it,
but there may be one already in progress which will not be synced.

In the case of a data=ordered umount with pending long symlinks which
are delayed due to a long list of other I/O on the backing block
device, this causes the buffers associated with the long symlinks not
to be moved to the inode dirty list in the second phase of
fsync_super. Then, before they can be dirtied again, kjournald sees
the UMOUNT flag and exits, so the dirty pages are never written to the
backing block device, causing long symlink corruption and exposing new
or previously freed block data to userspace.

To ensure all commits are synced, we flush all journal commits now
when sync_fs'ing ext4.

Signed-off-by: Arthur Jones <ajones@xxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: "Theodore Ts'o" <tytso@xxxxxxx>
Cc: Eric Sandeen <sandeen@xxxxxxxxxx>
Cc: <linux-ext4@xxxxxxxxxxxxxxx>

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 97cb896..5b5e38e 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2907,12 +2907,9 @@ int ext4_force_commit(struct super_block *sb)
 /*
  * Ext4 always journals updates to the superblock itself, so we don't
  * have to propagate any other updates to the superblock on disk at this
- * point. Just start an async writeback to get the buffers on their way
- * to the disk.
- *
- * This implicitly triggers the writebehind on sync().
+ * point. (We can probably nuke this function altogether, and remove
+ * any mention to sb->s_dirt in all of fs/ext4; eventual cleanup...)
  */
-
 static void ext4_write_super(struct super_block *sb)
 {
 	if (mutex_trylock(&sb->s_lock) != 0)
@@ -2922,15 +2919,15 @@ static void ext4_write_super(struct super_block *sb)

 static int ext4_sync_fs(struct super_block *sb, int wait)
 {
-	tid_t target;
+	int ret;

 	trace_mark(ext4_sync_fs, "dev %s wait %d", sb->s_id, wait);
 	sb->s_dirt = 0;
-	if (jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, &target)) {
-		if (wait)
-			jbd2_log_wait_commit(EXT4_SB(sb)->s_journal, target);
-	}
-	return 0;
+	if (wait)
+		ret = ext4_force_commit(sb);
+	else
+		ret = jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, NULL);
+	return ret;
 }

/*
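
For anyone following along, here is a rough user-space model of the
race described in the commit message above. It is only a sketch: the
toy_* names and the three-field journal are made up for illustration
and are not the jbd2 data structures or API. The assumption it encodes
is that "start commit" reports success only when this caller newly
requested the commit, so a commit that somebody else already requested
is never waited on by the old logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy journal state (hypothetical; NOT the real journal_t). */
struct toy_journal {
	unsigned running_tid;     /* transaction accepting updates, 0 if none */
	unsigned commit_request;  /* highest tid anyone has asked to commit */
	unsigned commit_sequence; /* highest tid fully written to disk */
};

/*
 * Request a commit of the running transaction.  Returns true only if
 * this call newly requested it; false if there is nothing running or
 * the commit was already requested by someone else.
 */
static bool toy_start_commit(struct toy_journal *j, unsigned *tid)
{
	assert(j != NULL);
	if (j->running_tid == 0 || j->commit_request == j->running_tid)
		return false;
	j->commit_request = j->running_tid;
	if (tid)
		*tid = j->running_tid;
	return true;
}

/* Stand-in for blocking until transaction 'tid' is on disk. */
static void toy_wait_commit(struct toy_journal *j, unsigned tid)
{
	assert(j != NULL);
	if (j->commit_sequence < tid)
		j->commit_sequence = tid;
}

/* Old logic: wait only for a commit that this call started. */
static void toy_sync_fs_old(struct toy_journal *j, bool wait)
{
	unsigned target;

	if (toy_start_commit(j, &target)) {
		if (wait)
			toy_wait_commit(j, target);
	}
}

/* New logic: when asked to wait, wait for every requested commit. */
static void toy_sync_fs_new(struct toy_journal *j, bool wait)
{
	toy_start_commit(j, NULL);
	if (wait)
		toy_wait_commit(j, j->commit_request);
}

/* True when nothing that was requested is still pending. */
static bool toy_all_committed(const struct toy_journal *j)
{
	return j->commit_sequence >= j->commit_request;
}
```

Starting from a journal where transaction 5 is running and its commit
has already been requested but not finished (commit_request = 5,
commit_sequence = 4), toy_sync_fs_old(&j, true) returns with the commit
still pending, while toy_sync_fs_new(&j, true) returns only after
commit_sequence has caught up. The real ext4_force_commit() path does
more than this model does; the sketch only shows why "wait only if we
started it" is not enough.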