Re: [PATCH 2/4] writeback: 3-queue based writeback schedule

From: Fengguang Wu
Date: Fri Aug 10 2007 - 12:47:39 EST


On Fri, Aug 10, 2007 at 02:34:14PM +0800, Fengguang Wu wrote:
> Properly manage the 3 queues of sb->s_dirty/s_io/s_more_io so that
> - time-ordering of dirtied_when can be easily maintained
> - writeback can continue from where the previous run left off
>
> The majority of the work has been done by Andrew Morton and Ken Chen,
> this patch just clarifies the roles of the 3 queues:
> - s_dirty for io delay (up to dirty_expire_interval)
> - s_io for io run (a full scan of s_io may involve multiple runs)
> - s_more_io for io continuation
>
> The following diagram shows the data flow.
>
>                         requeue on new scan(empty s_io)
>                         +-----------------------------+
>                         |                             |
> dirty           old     |                             |
> inodes          enough  V                             |
> ======> s_dirty ======> s_io                          |
>           ^             |      requeue io             |
>           |             +---------------------> s_more_io
>           |  hold back  |
>           +-------------+----------> disk write requests
>
> sb->s_dirty: a FIFO queue
> - s_dirty hosts not-yet-expired (recently dirtied) dirty inodes
> - once expired, inodes will be moved out of s_dirty and *never put back*
>   (unless for some reason we have to hold on to the inode for some time)
>
> sb->s_io and sb->s_more_io: a cyclic queue scanned for io
> - on each run of generic_sync_sb_inodes(), some more s_dirty inodes may be
>   appended to s_io
> - on each full scan of s_io, all s_more_io inodes will be moved back to s_io
> - large files that cannot be synced in one run will be moved to s_more_io for
>   retry on next full scan
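
To make the queue roles quoted above concrete, here is a toy userspace
model in Python. It is only a sketch under assumptions, not the patch:
the names SuperBlock, mark_dirty, sync_run, pages_left and budget are
made up for illustration, the per-run page budget is a stand-in for
whatever limits a real writeback run, and the order of the two requeue
steps inside a run is my guess.

```python
from collections import deque


class SuperBlock:
    """Toy model of sb->s_dirty / s_io / s_more_io (not kernel code)."""

    def __init__(self, expire_interval):
        self.expire_interval = expire_interval
        self.s_dirty = deque()    # (dirtied_when, inode), oldest on the left
        self.s_io = deque()       # inodes queued for the current scan
        self.s_more_io = deque()  # inodes that need another pass

    def mark_dirty(self, inode, now):
        # dirtied_when is stamped only here, so enqueue order == time order.
        self.s_dirty.append((now, inode))

    def sync_run(self, now, pages_left, budget):
        """One writeback run; returns a list of (inode, pages_written)."""
        if not self.s_io:
            # The previous full scan finished: requeue s_more_io onto s_io.
            self.s_io.extend(self.s_more_io)
            self.s_more_io.clear()
        # Append inodes that have been dirty for long enough.
        while self.s_dirty and now - self.s_dirty[0][0] >= self.expire_interval:
            self.s_io.append(self.s_dirty.popleft()[1])
        written = []
        while self.s_io and budget > 0:
            inode = self.s_io.popleft()
            n = min(pages_left[inode], budget)
            pages_left[inode] -= n
            budget -= n
            written.append((inode, n))
            if pages_left[inode] > 0:
                # Could not finish in this run: retry on the next full scan.
                self.s_more_io.append(inode)
        return written
```

Inodes still sitting on s_io when the budget runs out are simply picked
up by the next call, which is the "continue from where the previous run
stopped" behaviour in this model.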

In fact, s_more_io is no longer necessary. We end up with a priority
io-delaying queue, s_dirty, and a cyclic io-syncing queue, s_io, and
the two are properly decoupled. A more flexible data structure could
be used for s_dirty if we want to redirty an inode with an arbitrary
delay. More priority queues could also be introduced alongside
s_dirty. For example, we could designate a queue s_dirty_atime for
inodes dirtied only by `atime' and sync them lazily.
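
A rough way to picture the two-queue arrangement is the Python toy
below. It is a sketch under assumptions rather than proposed kernel
code: TwoQueueModel, redirty, sync_run, pages_left and budget are
hypothetical names, a binary heap keyed by due time stands in for the
"more flexible data structure", and a long delay stands in for a
separate s_dirty_atime queue.

```python
import heapq
from collections import deque
from itertools import count


class TwoQueueModel:
    """Toy model: priority queue s_dirty plus cyclic queue s_io."""

    def __init__(self):
        self._seq = count()  # tie-breaker: FIFO order for equal due times
        self.s_dirty = []    # min-heap of (due_time, seq, inode)
        self.s_io = deque()  # cyclic queue of inodes under writeback

    def redirty(self, inode, now, delay):
        # An arbitrary delay is just a different key in the heap.
        heapq.heappush(self.s_dirty, (now + delay, next(self._seq), inode))

    def sync_run(self, now, pages_left, budget):
        """One writeback run; returns a list of (inode, pages_written)."""
        # Move every inode whose delay has elapsed onto the io queue.
        while self.s_dirty and self.s_dirty[0][0] <= now:
            self.s_io.append(heapq.heappop(self.s_dirty)[2])
        written = []
        while self.s_io and budget > 0:
            inode = self.s_io.popleft()
            n = min(pages_left[inode], budget)
            pages_left[inode] -= n
            budget -= n
            written.append((inode, n))
            if pages_left[inode] > 0:
                # Unfinished inode rotates to the tail; no s_more_io needed.
                self.s_io.append(inode)
        return written
```

In this model a large file that cannot be finished in one run goes back
to the tail of s_io, so the next run resumes with it without any third
list.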

> inode->dirtied_when
> - inode->dirtied_when is updated to the *current* jiffies on pushing into
>   s_dirty, and is never changed in other cases.
> - time-ordering thus can be simply ensured while moving inodes between lists,
>   since (time order == enqueue order)
>
> Cc: Ken Chen <kenchen@xxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Fengguang Wu <wfg@xxxxxxxxxxxxxxxx>
> ---
> fs/fs-writeback.c | 106 +++++++++++++++++++++-----------------------
> 1 file changed, 52 insertions(+), 54 deletions(-)
