Re: [PATCH] f2fs: Fix recover when nid of non-inode dnode < nid of inode

From: Jaegeuk Kim
Date: Wed Sep 10 2014 - 03:21:43 EST


On Tue, Sep 09, 2014 at 07:31:49PM +0800, huang ying wrote:
> On Tue, Sep 9, 2014 at 3:09 PM, Jaegeuk Kim <jaegeuk@xxxxxxxxxx> wrote:
>
> > Hi,
> >
> > On Tue, Sep 09, 2014 at 01:39:30PM +0800, Huang Ying wrote:
> > > On Mon, 2014-09-08 at 22:23 -0700, Jaegeuk Kim wrote:
> > > > Hi Huang,
> > > >
> > > > On Mon, Sep 08, 2014 at 07:38:26PM +0800, Huang Ying wrote:
> > > > > For fsync, if the nid of a non-inode dnode < nid of inode and the
> > > > > inode is not checkpointed. The non-inode dnode may be written before
> > > > > inode. So in find_fsync_dnodes, f2fs_iget will fail, cause the
> > > > > recovery fail.
> > > > >
> > > > > Usually, inode will be allocated before non-inode dnode, so the nid
> > of
> > > > > inode < nid of non-inode dnode. But it is possible for the reverse.
> > > > > For example, because of alloc_nid_failed.
> > > > >
> > > > > This is fixed via ignoring non-inode dnode before inode dnode in
> > > > > find_fsync_dnodes.
> > > > >
> > > > > The patch was tested via allocating nid reversely via a debugging
> > > > > patch, that is, from big number to small number.
> > > > >
> > > > > Signed-off-by: Huang, Ying <ying.huang@xxxxxxxxx>
> > > > > ---
> > > > > fs/f2fs/recovery.c | 7 ++++---
> > > > > 1 file changed, 4 insertions(+), 3 deletions(-)
> > > > >
> > > > > --- a/fs/f2fs/recovery.c
> > > > > +++ b/fs/f2fs/recovery.c
> > > > > @@ -172,8 +172,8 @@ static int find_fsync_dnodes(struct f2fs
> > > > > if (IS_INODE(page) && is_dent_dnode(page))
> > > > > set_inode_flag(F2FS_I(entry->inode),
> > > > > FI_INC_LINK);
> > > > > - } else {
> > > > > - if (IS_INODE(page) && is_dent_dnode(page)) {
> > > >
> > > > If this is not inode block, we should add this inode to recover its
> > data blocks.
> > >
> > > Is it possible that there is only non-inode dnode but no inode when
> > > find_fsync_dnodes checking dnodes? Per my understanding, any changes to
> > > file will cause inode page dirty (for example, mtime changed), so that
> > > we will write inode block. Is it right? If so, the solution in this
> > > patch should work too.
> >
> > Your description says that f2fs_iget will fail, which causes the recovery
> > fail.
> > So, I thought it would be better to handle the f2fs_iget failure directly.
> >
>
> Yes. That is another way to fix the issue.
>
>
> > In addition, we cannot guarantee the write order of dnode and inode.
> > For exmaple,
> > 1. the inode is written by flusher or kswapd, then,
> > 2. f2fs_sync_file writes its dnode.
> >
> > In that case, we can get only non-inode dnode in the node chain, since the
> > inode
> > has not fsync_mark.
> >
>
> I think your solution is better here, but does not fix all scenarios. If
> the inode is checkpointed, the file can be recovered, although the inode
> information may be not up to date. But if the inode is not checkpointed,
> f2fs_iget will fail too and recover will fail.

Ok, let me consider your scenarios.

Term: F: fsync_mark, D: dentry_mark

1. inode(x) | CP | inode(x) | dnode(F)
-> Lose the latest inode(x). Need to fix.

2. inode(x) | CP | dnode(F) | inode(x)
-> Impossible, but recover latest dnode(F)

3. CP | inode(x) | dnode(F)
-> Need to write inode(DF) in f2fs_sync_file.

4. CP | dnode(F) | inode(DF)
-> If f2fs_iget fails, then goto next.

5. CP | dnode(F) | inode(x)
-> If f2fs_iget fails, then goto next. But, this is an impossible scenario.
Drop this dnode(F).

Indeed, there were some missing scenarios.

So, how about this patch?