Re: [PATCH] reiserfs: don't drop PG_dirty when releasing sub-page-sized dirty file

From: Peter Zijlstra
Date: Tue Oct 23 2007 - 06:07:56 EST


[ adding reiserfs devs to the CC ]

On Tue, 2007-10-23 at 15:55 +0800, Fengguang Wu wrote:
> This is not a new problem in 2.6.23-git17.
> 2.6.22 and 2.6.23 are buggy in the same way.
>
> Reiserfs can leave newly created sub-page-sized files in the dirty state
> forever. Neither the pdflush routines nor an explicit `sync' command
> writes them to disk. Only `umount' does the trick.
>
> The direct cause is: the dirty page's PG_dirty is wrongly _cleared_.
> Call trace:
> [<ffffffff8027e920>] cancel_dirty_page+0xd0/0xf0
> [<ffffffff8816d470>] :reiserfs:reiserfs_cut_from_item+0x660/0x710
> [<ffffffff8816d791>] :reiserfs:reiserfs_do_truncate+0x271/0x530
> [<ffffffff8815872d>] :reiserfs:reiserfs_truncate_file+0xfd/0x3b0
> [<ffffffff8815d3d0>] :reiserfs:reiserfs_file_release+0x1e0/0x340
> [<ffffffff802a187c>] __fput+0xcc/0x1b0
> [<ffffffff802a1ba6>] fput+0x16/0x20
> [<ffffffff8029e676>] filp_close+0x56/0x90
> [<ffffffff8029fe0d>] sys_close+0xad/0x110
> [<ffffffff8020c41e>] system_call+0x7e/0x83
>
> Fix the bug by removing the cancel_dirty_page() call. Tests with
> various write sizes show no ill effects from the removal.
>
>
> === for the patient ===
> Here are more detailed demonstrations of the problem.
>
> 1) the page has both PG_dirty(D)/PAGECACHE_TAG_DIRTY(d) after being written to;
> and then only PAGECACHE_TAG_DIRTY(d) remains after the file is closed.
>
> ------------------------------ screen 0 ------------------------------
> [T0] root /home/wfg# cat > /test/tiny
> [T1] hi
> [T2] root /home/wfg#
>
> ------------------------------ screen 1 ------------------------------
> [T1] root /home/wfg# echo /test/tiny > /proc/filecache
> [T1] root /home/wfg# cat /proc/filecache
> # file /test/tiny
> # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback
> # idx len state refcnt
> 0 1 ___UD__Bd_ 2
> [T2] root /home/wfg# cat /proc/filecache
> # file /test/tiny
> # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback
> # idx len state refcnt
> 0 1 ___U___Bd_ 2
>
> 2) note the non-zero 'cancelled_write_bytes' after /tmp/hi is copied.
>
> ------------------------------ screen 0 ------------------------------
> [T0] root /home/wfg# echo hi > /tmp/hi
> [T1] root /home/wfg# cp /tmp/hi /dev/stdin /test
> [T2] hi
> [T3] root /home/wfg#
>
> ------------------------------ screen 1 ------------------------------
> [T1] root /proc/4397# cd /proc/`pidof cp`
> [T1] root /proc/4713# cat io
> rchar: 8396
> wchar: 3
> syscr: 20
> syscw: 1
> read_bytes: 0
> write_bytes: 20480
> cancelled_write_bytes: 4096
> [T2] root /proc/4713# cat io
> rchar: 8399
> wchar: 6
> syscr: 21
> syscw: 2
> read_bytes: 0
> write_bytes: 24576
> cancelled_write_bytes: 4096
>
> //Question: the 'write_bytes' is a bit more than expected ;-)
>
> Cc: Maxim Levitsky <maximlevitsky@xxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Signed-off-by: Fengguang Wu <wfg@xxxxxxxxxxxxxxxx>
> ---
> fs/reiserfs/stree.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> --- linux-2.6.24-git17.orig/fs/reiserfs/stree.c
> +++ linux-2.6.24-git17/fs/reiserfs/stree.c
> @@ -1458,9 +1458,6 @@ static void unmap_buffers(struct page *p
>  				}
>  				bh = next;
>  			} while (bh != head);
> -			if (PAGE_SIZE == bh->b_size) {
> -				cancel_dirty_page(page, PAGE_CACHE_SIZE);
> -			}
>  		}
>  	}
>  }
>
