Re: [patch 6/7] fs: fsync optimisations

From: Nick Piggin
Date: Mon Nov 29 2010 - 19:11:15 EST


On Mon, Nov 29, 2010 at 10:03:25AM -0500, Christoph Hellwig wrote:
> On Wed, Nov 24, 2010 at 01:06:16AM +1100, npiggin@xxxxxxxxx wrote:
> > Optimise fsync by adding a datasync parameter to sync_inode_metadata to
> > DTRT with writing back the inode (->write_inode in theory should have a
> > datasync parameter too perhaps, but that's for another time).
> >
> > Also, implement the metadata sync optimally rather than reusing the
> > normal data writeback path. This means less useless moving the inode around the
> > writeback lists, and less dropping and retaking of inode_lock, and avoiding
> > the data writeback call with nr_pages == 0.
>
> This patch looks good to me, but a few minor comments below:

(BTW. it actually had a bug with writeback_end not being called in some
cases, in case you test it)

> > @@ -1315,13 +1315,49 @@ EXPORT_SYMBOL(sync_inode);
> > *
> > * Note: only writes the actual inode, no associated data or other metadata.
> > */
> > -int sync_inode_metadata(struct inode *inode, int wait)
> > +int sync_inode_metadata(struct inode *inode, int datasync, int wait)
> > {
> > + struct address_space *mapping = inode->i_mapping;
> > struct writeback_control wbc = {
> > .sync_mode = wait ? WB_SYNC_ALL : WB_SYNC_NONE,
> > .nr_to_write = 0, /* metadata-only */
> > };
> > + unsigned dirty, mask;
> > + int ret = 0;
> >
> > - return sync_inode(inode, &wbc);
> > + /*
> > + * This is a similar implementation to writeback_single_inode.
> > + * Keep them in sync.
> > + */
>
> I'd move this comment to above the function.

OK

>
> > + /*
> > + * Generic write_inode doesn't distinguish between sync and datasync,
> > + * so even a datasync can clear the sync state. Filesystems which
> > + * distiguish these cases must only clear 'mask' in their metadata
> > + * sync code.
> > + */
>
> I don't understand this comment. Currenly filesystems never clear
> i_state bits by themselves.

No, but with the new inode writeback routines they could. Basically
they might be expected to copy this function, and put their own
improvements in.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/