Re: [PATCH] fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers

From: Dave Chinner
Date: Tue Jul 19 2011 - 06:24:32 EST


On Tue, Jun 28, 2011 at 11:35:10AM -0400, Josef Bacik wrote:
> Btrfs needs to be able to control how filemap_write_and_wait_range() is called
> in fsync to make it less of a painful operation, so push down taking i_mutex and
> the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
> file systems can drop taking the i_mutex altogether it seems, like ext3 and
> ocfs2. For correctness sake I just pushed everything down in all cases to make
> sure that we keep the current behavior the same for everybody, and then each
> individual fs maintainer can make up their mind about what to do from there.
> Thanks,
>
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxx>

.....

> diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
> index 7f782af..9183f75 100644
> --- a/fs/xfs/linux-2.6/xfs_file.c
> +++ b/fs/xfs/linux-2.6/xfs_file.c
> @@ -127,6 +127,8 @@ xfs_iozero(
>  STATIC int
>  xfs_file_fsync(
>  	struct file *file,
> +	loff_t start,
> +	loff_t end,
>  	int datasync)
>  {
>  	struct inode *inode = file->f_mapping->host;
> @@ -138,8 +140,16 @@ xfs_file_fsync(
>
>  	trace_xfs_file_fsync(ip);
>
> -	if (XFS_FORCED_SHUTDOWN(mp))
> +	error = filemap_write_and_wait_range(inode->i_mapping, start, end);
> +	if (error)
> +		return error;
> +
> +	mutex_lock(&inode->i_mutex);
> +
> +	if (XFS_FORCED_SHUTDOWN(mp)) {
> +		mutex_unlock(&inode->i_mutex);
>  		return -XFS_ERROR(EIO);
> +	}
>
>  	xfs_iflags_clear(ip, XFS_ITRUNCATED);

Josef, FYI, this causes deadlocks in XFS.

You cannot take the i_mutex in this function as it violates locking
order when called from xfs_file_aio_write() for O_SYNC buffered
IO. We already hold the i_mutex when calling the function, and I
don't think we can drop it without violating sync write atomicity...
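
To illustrate the shape of the problem, here is a minimal userspace
sketch, not the XFS code. It assumes a pthread error-checking mutex
as a stand-in for i_mutex, so the nested acquisition reports EDEADLK
instead of hanging the way a kernel mutex would. The names
sketch_fsync() and sketch_sync_write() are hypothetical.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_ERRORCHECK under strict -std= */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for inode->i_mutex. */
static pthread_mutex_t i_mutex;

/*
 * Use an error-checking mutex so that relocking from the owning
 * thread returns EDEADLK rather than blocking forever.
 */
static void i_mutex_init(void)
{
	pthread_mutexattr_t attr;
	int err;

	err = pthread_mutexattr_init(&attr);
	assert(err == 0);
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(err == 0);
	err = pthread_mutex_init(&i_mutex, &attr);
	assert(err == 0);
	pthread_mutexattr_destroy(&attr);
}

/* Models the patched ->fsync() handler: it takes i_mutex itself. */
static int sketch_fsync(void)
{
	int err = pthread_mutex_lock(&i_mutex);

	if (err)
		return err;	/* EDEADLK when the caller already holds it */

	/* ... metadata flush would happen here ... */

	pthread_mutex_unlock(&i_mutex);
	return 0;
}

/*
 * Models an O_SYNC buffered write: i_mutex is held across both the
 * write and the sync, so the nested lock in sketch_fsync() fails.
 */
static int sketch_sync_write(void)
{
	int err;

	pthread_mutex_lock(&i_mutex);

	/* ... data would be copied into the page cache here ... */

	err = sketch_fsync();

	pthread_mutex_unlock(&i_mutex);
	return err;
}
```

After i_mutex_init(), calling sketch_fsync() on its own succeeds,
while sketch_sync_write() gets EDEADLK back from the nested lock
attempt. In the kernel there is no such error return: the task simply
blocks on a mutex it already owns.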

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx