Re: [PATCH] ext4: fix checking on nr_to_write

From: Jan Kara
Date: Mon Oct 14 2013 - 08:59:09 EST


Hello,

On Mon 14-10-13 00:39:52, Ming Lei wrote:
> Commit 4e7ea81db5(ext4: restructure writeback path) introduces
> another performance regression on random write:
>
> - one more page may be mapped to ext4 extent in mpage_prepare_extent_to_map,
> and will be submitted for I/O so nr_to_write will become -1 before 'done'
> is set
>
> - the worse thing is that dirty pages may still be retrieved from page
> cache after nr_to_write becomes negative, so lots of small chunks can be
> submitted to block device when page writeback is catching up with write path,
> and performance is hurted.
Umm, I guess I see what are you pointing at. Thanks for catching that.
mpage_process_page_bufs() always adds a buffer to mpd even if nr_to_write
is already <= 0. But I would somewhat prefer not to call
mpage_prepare_extent_to_map() at all when nr_to_write <= 0. So a patch
like:
ret = mpage_prepare_extent_to_map(&mpd);
if (!ret) {
- if (mpd.map.m_len)
+ if (mpd.map.m_len) {
ret = mpage_map_and_submit_extent(handle, &mpd,
&give_up_on_write);
- else {
+ done = (wbc->nr_to_write <= 0);
+ } else {

Should also fix your problem, am I right?

Honza

> On one arm A15 board(arndale) with sata 3.0 SSD(CPU: 1.5GHz dura core, RAM: 2GB),
> this patch can improve below test result from 157MB/sec to 174MB/sec(>10%):
>
> dd if=/dev/zero of=./z.img bs=8K count=512K
>
> The above test is actually prototype of block write in bonnie++ utility.
>
> This patch fixes check on nr_to_write in mpage_prepare_extent_to_map()
> to make sure nr_to_write won't become negative.
>
> Cc: Ted Tso <tytso@xxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Cc: linux-ext4@xxxxxxxxxxxxxxx
> Cc: "linux-fsdevel@xxxxxxxxxxxxxxx" <linux-fsdevel@xxxxxxxxxxxxxxx>
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxxxxx>
> ---
> fs/ext4/inode.c | 20 ++++++++++----------
> 1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 32c04ab..6a62803 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2356,15 +2356,6 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
> if (mpd->map.m_len == 0)
> mpd->first_page = page->index;
> mpd->next_page = page->index + 1;
> - /* Add all dirty buffers to mpd */
> - lblk = ((ext4_lblk_t)page->index) <<
> - (PAGE_CACHE_SHIFT - blkbits);
> - head = page_buffers(page);
> - err = mpage_process_page_bufs(mpd, head, head, lblk);
> - if (err <= 0)
> - goto out;
> - err = 0;
> -
> /*
> * Accumulated enough dirty pages? This doesn't apply
> * to WB_SYNC_ALL mode. For integrity sync we have to
> @@ -2374,9 +2365,18 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
> * of the old dirty pages.
> */
> if (mpd->wbc->sync_mode == WB_SYNC_NONE &&
> - mpd->next_page - mpd->first_page >=
> + mpd->next_page - mpd->first_page >
> mpd->wbc->nr_to_write)
> goto out;
> +
> + /* Add all dirty buffers to mpd */
> + lblk = ((ext4_lblk_t)page->index) <<
> + (PAGE_CACHE_SHIFT - blkbits);
> + head = page_buffers(page);
> + err = mpage_process_page_bufs(mpd, head, head, lblk);
> + if (err <= 0)
> + goto out;
> + err = 0;
> }
> pagevec_release(&pvec);
> cond_resched();
> --
> 1.7.9.5
>
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/