Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid

From: Jaegeuk Kim
Date: Thu Aug 20 2015 - 11:35:51 EST


On Thu, Aug 20, 2015 at 05:12:03PM +0800, Chao Yu wrote:
> Hi Jaegeuk,
>
> > -----Original Message-----
> > From: Jaegeuk Kim [mailto:jaegeuk@xxxxxxxxxx]
> > Sent: Tuesday, August 18, 2015 4:46 PM
> > To: linux-kernel@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx;
> > linux-f2fs-devel@xxxxxxxxxxxxxxxxxxxxx
> > Cc: Jaegeuk Kim
> > Subject: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
> >
> > This patch adds a routine which checks the block address of newly allocated nid.
> > If an nid has already allocated by other thread due to subtle data races, it
> > will result in filesystem corruption.
> > So, it needs to check whether its block address was already allocated or not
> > in prior to nid allocation as the last chance.
> >
> > Signed-off-by: Jaegeuk Kim <jaegeuk@xxxxxxxxxx>
> > ---
> > fs/f2fs/node.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index 3cc32b8..6bef5a2 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -1573,6 +1573,8 @@ retry:
> >
> > /* We should not use stale free nids created by build_free_nids */
> > if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
> > + struct node_info ni;
> > +
> > f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list));
> > list_for_each_entry(i, &nm_i->free_nid_list, list)
> > if (i->state == NID_NEW)
> > @@ -1583,6 +1585,13 @@ retry:
> > i->state = NID_ALLOC;
> > nm_i->fcnt--;
> > spin_unlock(&nm_i->free_nid_list_lock);
> > +
> > + /* check nid is allocated already */
> > + get_node_info(sbi, *nid, &ni);
> > + if (ni.blk_addr != NULL_ADDR) {
>
> I didn't get it, why free nid is with non-NULL blkaddr?
> Could you please explain more about this?

As I wrote in the description, I've been suffering from wrongly added free nids
which results in fs corruption. I suspect somewhat race condition in
build_free_nids, but it is very subtle to figure out exactly.
So, I wrote this patch to fix that.

The concern would be performance regarding to cold cache miss at an NAT entry.
However, I expect that it would be tolerable since get_node_info will be called
after alloc_nid later.

>
> > + alloc_nid_done(sbi, *nid);
>
> Will another thread call alloc_nid_done too, making this free nid being
> released again?

No, its state became NID_ALLOC, so no other thread can pick this up till
alloc_nid_done is called.

Thanks,

>
> Thanks,
>
> > + goto retry;
> > + }
> > return true;
> > }
> > spin_unlock(&nm_i->free_nid_list_lock);
> > --
> > 2.1.1
> >
> >
> > ------------------------------------------------------------------------------
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@xxxxxxxxxxxxxxxxxxxxx
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/