Re: [RFC][PATCH] Possible data integrity problems in lots of filesystems?

From: Christoph Hellwig
Date: Thu Nov 25 2010 - 07:01:49 EST


On Thu, Nov 25, 2010 at 10:54:57PM +1100, Nick Piggin wrote:
> On Thu, Nov 25, 2010 at 06:49:09PM +1100, Nick Piggin wrote:
> > Second is confusing sync and async inode metadata writeout
> > Core code clears I_DIRTY_SYNC and I_DIRTY_DATASYNC before calling
> > ->write_inode *regardless* of whether it is a for-integrity call or
> > not. This means background writeback can clear it, and subsequent
> > sync_inode_metadata or sync(2) call will skip the next ->write_inode
> > completely.
>
> Hmm, this also means that write_inode_now(sync=1) is buggy. It
> needs to in fact call ->fsync -- which is a file operation
> unfortunately, Christoph didn't you have some patches to move it
> into an inode operation?

No, it doesn't really make much sense either. But what I've slowly
started doing is to phase out write_inode_now. For the cases where
we really only want to write the inode we should use
sync_inode_metadata. That leaves only two other callers:

- iput_final for a filesystem during unmount. This should be covered
by the rule you mentioned above that ->sync_fs needs to be called,
but it needs a closer audit.
- nfsd. Any filesystem that cares should just use the commit_metadata
export operation, which is a subset of ->fsync: it only needs to
guarantee that metadata is on disk, not any file data, so there is
no cache flush mess as in a real fsync implementation.
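
For reference, the flag-clearing problem quoted above can be modelled
in a few lines of userspace C. This is only a toy sketch, not kernel
code: the two flag names mirror the kernel's, but struct toy_inode and
every toy_* function are hypothetical and exist purely to show why a
later for-integrity call can end up skipping ->write_inode.

```c
/* Toy userspace model, NOT kernel code. Only the flag names are real;
 * the struct and all toy_* helpers are made up for illustration. */
#include <stdbool.h>

#define I_DIRTY_SYNC     (1u << 0)
#define I_DIRTY_DATASYNC (1u << 1)

struct toy_inode {
	unsigned int state;   /* dirty flags */
	bool on_disk;         /* metadata known to be durable */
	int write_calls;      /* how often the write_inode model ran */
};

/* Model of ->write_inode: only a synchronous call waits for the I/O,
 * so only a synchronous call makes the metadata durable here. */
static void toy_write_inode(struct toy_inode *inode, bool sync)
{
	inode->write_calls++;
	if (sync)
		inode->on_disk = true;
}

/* Model of the core path: the dirty flags are cleared before the call,
 * regardless of whether the caller needs integrity or not. */
static void toy_writeback(struct toy_inode *inode, bool sync)
{
	unsigned int dirty = inode->state & (I_DIRTY_SYNC | I_DIRTY_DATASYNC);

	inode->state &= ~dirty;
	if (dirty)
		toy_write_inode(inode, sync);
}

/* Background writeback followed by an integrity sync. Returns whether
 * the metadata ended up durable. */
static bool toy_sync_after_background(void)
{
	struct toy_inode inode = { .state = I_DIRTY_SYNC };

	toy_writeback(&inode, false);  /* background: clears flag, async */
	toy_writeback(&inode, true);   /* integrity: inode looks clean */
	return inode.on_disk;
}
```

In this model the second call finds no dirty bits and never reaches
toy_write_inode, so toy_sync_after_background() reports that the
metadata was not made durable, whereas a lone synchronous call on a
dirty inode does.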
