Re: XFS: how to NOT null files on fsck?

From: Andrew Morton
Date: Thu Aug 05 2004 - 20:37:37 EST


Nathan Scott <nathans@xxxxxxx> wrote:
>
> On Thu, Aug 05, 2004 at 10:16:02AM +0200, Helge Hafting wrote:
> > L A Walsh wrote:
> > >Now I know it takes a while before data may end up on disk and that it
> > >may not go out to disk in an ordered fashion, but 2-3 days?
> >
> > Seems strange to me, but the amount of delay is entirely up to the
> > filesystem.
>
> The flushing of dirty file data is actually performed by
> kernel threads outside of the individual filesystems.
>
> I cannot explain a 2/3 day wait for data to get flushed,
> something really strange going on for you there.

Well there was a writeback bug which could cause files to not get written
back ever. Perhaps an unmount would cause writeback but nothing else would.

It was fixed by the below patch. The situation will only arise with a
combination of a race and a data-synchronising writeback (O_SYNC, fsync,
etc).

It's unlikely that this is the cause of this report though.




From: Miklos Szeredi <miklos@xxxxxxxxxx>

This patch fixes a hard-to-trigger condition, where the inode is on the
inode_in_use list while it's state is dirty. In this state dirty pages are
not written back in sync() or from kupdate, only from direct page reclaim.
And this causes a livelock in balance_dirty_pages after a while.

The actual sequence of events required to get into this state is:

thread function inode state inode list
----------------------------------------------------------------------------
1 __sync_single_inode (background) I_DIRTY sb->s_io
1 do_writepages ... I_LOCKED
2 __writeback_single_inode (sync) sleeps I_LOCKED
1 __sync_single_inode (background) finish 0 inode_in_use
2 __writeback_single_inode (sync) wakeup 0
2 __sync_single_inode (sync) 0
2 do_writepages ... I_LOCKED
3 __mark_inode_dirty I_LOCKED | I_DIRTY
2 __sync_single_inode (sync) finish I_DIRTY left on
inode_in_use

Signed-off-by: Miklos Szeredi <miklos@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

25-akpm/fs/fs-writeback.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletion(-)

diff -puN fs/fs-writeback.c~fix-inode-state-corruption-268-rc1-bk1 fs/fs-writeback.c
--- 25/fs/fs-writeback.c~fix-inode-state-corruption-268-rc1-bk1 Fri Jul 16 15:06:57 2004
+++ 25-akpm/fs/fs-writeback.c Fri Jul 16 15:06:57 2004
@@ -213,8 +213,9 @@ __sync_single_inode(struct inode *inode,
} else if (inode->i_state & I_DIRTY) {
/*
* Someone redirtied the inode while were writing back
- * the pages: nothing to do.
+ * the pages.
*/
+ list_move(&inode->i_list, &sb->s_dirty);
} else if (atomic_read(&inode->i_count)) {
/*
* The inode is clean, inuse
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/