Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

From: Jan Kara
Date: Thu Dec 18 2008 - 18:17:23 EST


Hello,

I'm sorry I'm replying late but I got time to react to this only now...

> On Mon, Nov 03, 2008 at 11:33:18AM -0800, Andrew Morton wrote:
> > [...]
> > > --- a/fs/ext3/super.c
> > > +++ b/fs/ext3/super.c
> > > @@ -2392,7 +2392,13 @@ static int ext3_sync_fs(struct super_block *sb, int wait)
> > > if (journal_start_commit(EXT3_SB(sb)->s_journal, &target)) {
> > > if (wait)
> > > log_wait_commit(EXT3_SB(sb)->s_journal, target);
> > > - }
> > > + } else if (wait)
> > > + /*
> > > + * We may have a commit in progress, clear it out
> > > + * before we go on...
> > > + */
> > > + ext3_force_commit(sb);
> > > +
> > > return 0;
> > > }
> >
> > Can we do
> >
> > sb->s_dirt = 0;
> > if (wait)
> > ext3_force_commit(...);
> > else
> > journal_start_commit(...);
>
> I think this is what you had in mind:
>
>
> diff --git a/fs/ext3/super.c b/fs/ext3/super.c
> index 18eaa78..2b48b85 100644
> --- a/fs/ext3/super.c
> +++ b/fs/ext3/super.c
> @@ -2386,13 +2386,12 @@ static void ext3_write_super (struct super_block * sb)
>
> static int ext3_sync_fs(struct super_block *sb, int wait)
> {
> - tid_t target;
> -
> sb->s_dirt = 0;
> - if (journal_start_commit(EXT3_SB(sb)->s_journal, &target)) {
> - if (wait)
> - log_wait_commit(EXT3_SB(sb)->s_journal, target);
> - }
> + if (wait)
> + ext3_force_commit(sb);
> + else
> + journal_start_commit(EXT3_SB(sb)->s_journal, NULL);
> +
> return 0;
> }
>
> I tried this and it too fixes the problem. FWIW I agree it
> looks better...
Well, shouldn't we rather fix what journal_start_commit() returns?
The interface which returns 1 when a transaction is already committing or
a transaction commit has just been started but 0 when we race with
somebody staring the commit is fairly unusable. Moreover
ext3_force_commit() will unnecessarily create new sync transaction and
commit it if there's no transaction running which is quite expensive
(even merging empty sync handle is not for free because of sync
transaction batching). But this is minor problem since we probably
don't care too much about sync() performance - BTW this is probably a
cause for bug 12224, isn't it?
BTW: ocfs2 would need fixing as well if done your way since it's
sync_fs function has been copied over from ext3.
To summarized I'd rather see a patch like below (untested) going in
and your patch reverted... Opinions? I can cookup a JBD2 version of
the patch in case we agree to go this way.

Honza