Re: [PATCH 2/7] writeback: avoid redirtying when ->write_inode failed to clear I_DIRTY

From: Jan Kara
Date: Thu Oct 20 2011 - 19:24:16 EST


On Thu 20-10-11 23:22:42, Wu Fengguang wrote:
> From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
>
> Right now ->write_inode has no way to safely return an EAGAIN without explicitly
> redirtying the inode, as we would lose the dirty state otherwise. Most
> filesystems get this wrong, but XFS makes heavy use of it to avoid blocking
> the flusher thread when ->write_inode hits contended inode locks. A
> contended ilock is something XFS can hit very easily when extending files, as
> the data I/O completion handler takes the lock to update the size, and the
> ->write_inode call can race with it fairly easily if writing enough data
> in one go so that the completion for the first write comes in just before
> we call ->write_inode.
>
> Change the handling of this case to use requeue_io_wait for a quick retry instead
> of redirty_tail, which keeps moving out the dirtied_when data and thus keeps
> delaying the writeout more and more with every failed attempt to get the lock.
>
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
You can add:
Acked-by: Jan Kara <jack@xxxxxxx>

Honza

> ---
> fs/fs-writeback.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> --- linux-next.orig/fs/fs-writeback.c 2011-10-08 13:30:25.000000000 +0800
> +++ linux-next/fs/fs-writeback.c 2011-10-08 13:30:41.000000000 +0800
> @@ -488,8 +488,18 @@ writeback_single_inode(struct inode *ino
> * operations, such as delayed allocation during
> * submission or metadata updates after data IO
> * completion.
> + *
> + * For the latter case it is very important to give
> + * the inode another turn on b_more_io instead of
> + * redirtying it. Constantly moving dirtied_when
> + * forward will prevent us from ever writing out
> + * the metadata dirtied in the I/O completion handler.
> + *
> + * For files on XFS that constantly get appended to
> + * calling redirty_tail means they will never get
> + * their updated i_size written out.
> */
> - redirty_tail(inode, wb);
> + requeue_io_wait(inode, wb);
> } else {
> /*
> * The inode is clean. At this point we either have
>
>
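
To illustrate why the choice of requeue matters, here is a toy user-space
model in C. It is only a sketch under stated assumptions, not the kernel
code: time_to_writeout(), the policy enum, and the 30 s expiry / 5 s flusher
period are hypothetical names and values picked for illustration, and the
REDIRTY_TAIL branch assumes dirtied_when is reset on every failed attempt,
as the changelog describes.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two requeue strategies in the patch. */
enum policy { REDIRTY_TAIL, REQUEUE_IO_WAIT };

#define EXPIRE_SECS       30  /* assumed dirty-expire interval */
#define FLUSH_PERIOD_SECS  5  /* assumed flusher wakeup period */

/*
 * Toy model: the inode is dirtied at t=0 and ->write_inode fails on
 * the first `contended_attempts` tries (lock contention). Returns the
 * simulated time, in seconds, of the first successful writeout.
 */
static int time_to_writeout(enum policy p, int contended_attempts)
{
	int dirtied_when = 0;
	int attempts = 0;

	assert(contended_attempts >= 0);

	for (int now = 0; ; now += FLUSH_PERIOD_SECS) {
		/* The flusher only picks up inodes that have expired. */
		if (now - dirtied_when < EXPIRE_SECS)
			continue;

		/* The lock is free once the contended attempts are used up. */
		if (attempts++ >= contended_attempts)
			return now;

		/*
		 * Failed attempt. The redirty policy pushes dirtied_when
		 * forward, so the inode has to expire all over again. The
		 * requeue policy leaves it alone, so the next flusher pass
		 * retries right away.
		 */
		if (p == REDIRTY_TAIL)
			dirtied_when = now;
	}
}
```

With three contended attempts, this model writes the inode at 45 s under
the requeue policy and at 120 s under the redirty policy. Each failure under
redirty costs a full expiry interval instead of one flusher period, and a
file that is appended to constantly would never leave the loop.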
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR