Re: ext2 + -osync: not as easy as it seems

From: Jan Kara
Date: Wed Jan 14 2009 - 09:05:45 EST


On Wed 14-01-09 08:21:46, Theodore Tso wrote:
> On Wed, Jan 14, 2009 at 11:35:33AM +0100, Jan Kara wrote:
> > Yes, I noticed that yesterday as well. But then I was puzzled why ext4
> > would need the flush where it has it... sync_inode() has started and
> > committed a transaction which issued a barrier when the commit was done.
>
> You're right; I'm not convinced we need the flush in ext4 (or ext3) at
> all. We write the data blocks, *then* we call ext4_write_inode(),
> which will force a commit. Now, if we apply that patch which
> optimizes out commits if there are no dirty blocks, then we'll be in
> trouble, because we won't know for sure whether or not
> ext4_write_inode() will have forced a journal commit.
>
> If we optimize out the journal commit when there are no blocks
> attached to the transaction, we could change the patch to only force a
> flush if inode->i_state did not have I_DIRTY before the call to
> sync_inode(). Does that sound sane?
Yes. And also add a flush in case of fdatasync().
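
To spell out the rule being proposed, here is a minimal sketch in C. It is
a userspace model, not kernel code: struct sync_request and
needs_explicit_flush() are hypothetical names invented for illustration,
and it assumes the "skip empty commits" optimization has been applied, so
a commit (and its barrier) can only be counted on when the inode was dirty
before sync_inode() was called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one fsync()/fdatasync() call; not a kernel structure. */
struct sync_request {
    bool inode_was_dirty; /* I_DIRTY was set in i_state before sync_inode() */
    bool datasync;        /* the call came in through fdatasync() */
};

/*
 * Returns true when the sync path would have to issue its own cache flush,
 * because a journal commit (and the barrier that comes with it) cannot be
 * assumed to have happened.
 */
static bool needs_explicit_flush(const struct sync_request *req)
{
    assert(req != NULL);

    /* Assumption in this sketch: fdatasync() always gets a flush. */
    if (req->datasync)
        return true;

    /* A clean inode means no commit was forced, so flush explicitly. */
    return !req->inode_was_dirty;
}
```

In this model a plain fsync() on a dirty inode relies on the barrier from
the commit, while a clean inode or any fdatasync() takes the explicit
flush.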

> > The only reason I could imagine is that barrier (although it is usually
> > translated to flushing writeback caches) actually means just an ordering
> > requirement and hence does not necessarily mean that the caches are
> > properly flushed. Is that so, Eric?
>
> I'm not sure what you mean; if the barrier operation isn't flushing
> all of the caches all the way out to the iron oxide, it's not going to
> be working properly no matter where it is being called, whether it's
> in ext4_sync_file() or in jbd2's journal_submit_commit_record().
Well, I thought that a barrier, as an abstraction, only guarantees that
any IO submitted before the barrier hits the iron before any IO submitted
after the barrier. That is enough for journalling to work correctly, but
it is not enough for fsync() guarantees. But I might be wrong...
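
The difference between the two guarantees can be shown with a toy model
in C. Everything in it (struct toy_disk, toy_submit(), toy_barrier(),
toy_destage(), toy_flush(), toy_is_durable()) is hypothetical and made up
for this sketch; it only models a disk with a volatile write cache where a
barrier is treated as a pure ordering point, which is the abstraction
described above and not necessarily what the block layer actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WRITES 8

/* One write sitting in the (hypothetical) volatile cache of the disk. */
struct toy_write {
    int block;    /* block number, illustrative only */
    int epoch;    /* number of barriers seen before this write */
    bool durable; /* true once the write has reached the media */
};

struct toy_disk {
    struct toy_write writes[TOY_MAX_WRITES];
    size_t count;
    int epoch;
};

/* Accept a write into the cache; nothing is durable yet. */
static void toy_submit(struct toy_disk *d, int block)
{
    assert(d->count < TOY_MAX_WRITES);
    d->writes[d->count++] = (struct toy_write){ block, d->epoch, false };
}

/* Ordering-only barrier: opens a new epoch, forces nothing to the media. */
static void toy_barrier(struct toy_disk *d)
{
    d->epoch++;
}

/*
 * The device tries to persist write idx on its own schedule. It refuses
 * when a write from an earlier epoch is still only in the cache.
 */
static bool toy_destage(struct toy_disk *d, size_t idx)
{
    assert(idx < d->count);
    for (size_t i = 0; i < d->count; i++)
        if (d->writes[i].epoch < d->writes[idx].epoch && !d->writes[i].durable)
            return false;
    d->writes[idx].durable = true;
    return true;
}

/* Real cache flush: everything accepted so far becomes durable. */
static void toy_flush(struct toy_disk *d)
{
    for (size_t i = 0; i < d->count; i++)
        d->writes[i].durable = true;
}

/* Has the most recent write to this block reached the media? */
static bool toy_is_durable(const struct toy_disk *d, int block)
{
    bool durable = false;
    for (size_t i = 0; i < d->count; i++)
        if (d->writes[i].block == block)
            durable = d->writes[i].durable;
    return durable;
}
```

Submitting a data block, calling toy_barrier(), and then submitting a
commit block leaves both writes in the cache: the commit can never become
durable before the data, which is all a journal needs, but neither is
durable when the call returns, which is what fsync() has to promise. Only
toy_flush() gives that.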

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR