Re: [PATCH RFC] Add locking to ext3_do_update_inode
From: Jan Kara
Date: Mon Sep 07 2009 - 18:30:34 EST
On Fri 04-09-09 16:06:13, Chris Mason wrote:
> I've been struggling with this off and on while I've been testing the
> data=guarded work. The symptom is corrupted orphan lists and inodes
> with the wrong i_size stored on disk. I was convinced the
> data=guarded code was just missing a call to ext3_mark_inode_dirty, but
> tracing showed the i_disksize I was sending to ext3_mark_inode_dirty
> wasn't actually making it to the drive.
> ext3_mark_inode_dirty can be called without locks held (atime updates
> and a few others), so the data=guarded code uses locks while updating
> the in-memory inode, and then calls ext3_mark_inode_dirty
> without any locks held.
> But, ext3_mark_inode_dirty has no internal locking to make sure that
> only one CPU is updating the buffer head at a time. Generally this
> works out ok because everyone that changes the inode then calls
> ext3_mark_inode_dirty themselves. Even though it races, eventually
> someone updates the buffer heads and things move on.
> But there is still a risk of the wrong values getting in, and the
> data=guarded code seems to hit the race very often.
> Since everyone that changes the inode also logs it, it should be
> possible to fix this with some memory barriers. I'll leave that as an
> exercise to the reader and lock the buffer head instead.
One more thing - Ted, I believe ext4 needs a similar patch.
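The race in the quoted text can be sketched in userspace. This is only an analogy, assuming pthread mutexes in place of the buffer head lock; `mem_inode`, `disk_buf`, `update_disk_copy` and `run_demo` are hypothetical names, not ext3 code. Each writer changes the in-memory value and then copies it out, as ext3 callers do via ext3_mark_inode_dirty. Because the read of the in-memory value and the store into the buffer both happen under the buffer lock, the last copier always sees every earlier update.

```c
#include <pthread.h>
#include <stdint.h>

#define NTHREADS 4
#define NITERS   10000

/* Hypothetical stand-ins, NOT the real ext3 structures. */
struct mem_inode {
    pthread_mutex_t lock;      /* protects the in-memory field */
    uint64_t disksize;         /* plays the role of i_disksize */
};

struct disk_buf {
    pthread_mutex_t lock;      /* plays the role of the buffer head lock */
    uint64_t size;             /* on-disk copy of the size */
};

static struct mem_inode g_inode = { PTHREAD_MUTEX_INITIALIZER, 0 };
static struct disk_buf  g_buf   = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Copy the in-memory value into the buffer while holding the buffer lock.
 * Without buf->lock, a thread could read an old value, be delayed, and
 * then overwrite a newer value that another thread had already stored. */
void update_disk_copy(struct mem_inode *in, struct disk_buf *buf)
{
    pthread_mutex_lock(&buf->lock);
    pthread_mutex_lock(&in->lock);     /* lock order: buffer, then inode */
    buf->size = in->disksize;
    pthread_mutex_unlock(&in->lock);
    pthread_mutex_unlock(&buf->lock);
}

/* Each writer updates the in-memory inode under its lock, drops the lock,
 * and then copies the inode out with no inode lock held. */
static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        pthread_mutex_lock(&g_inode.lock);
        g_inode.disksize++;
        pthread_mutex_unlock(&g_inode.lock);
        update_disk_copy(&g_inode, &g_buf);
    }
    return NULL;
}

/* Returns 0 if the buffer matches the in-memory inode once all writers
 * have finished, 1 if it is stale, -1 if a thread could not be created. */
int run_demo(void)
{
    pthread_t t[NTHREADS];
    int started = 0, failed = 0;

    for (int i = 0; i < NTHREADS; i++) {
        if (pthread_create(&t[i], NULL, writer, NULL) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    if (failed)
        return -1;
    return g_buf.size == g_inode.disksize ? 0 : 1;
}
```

Dropping the `buf->lock` calls from `update_disk_copy` reopens the window described in the quoted text: a stale value can be stored last, and nothing guarantees a later copy will repair it.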
> It is probably a good idea to have a different patch series for lockless
> bit flipping on the ext3 i_state field. ext3_do_update_inode uses &= to
> clear EXT3_STATE_NEW without any locks held.
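The problem with a plain `&=` is that it is a non-atomic read-modify-write, so a concurrent update to another bit in the same word can be lost. A minimal userspace sketch of lockless bit flipping, assuming C11 atomics rather than the kernel's own bit operations; the flag values and the `state_*` helpers are hypothetical and not taken from ext3:

```c
#include <stdatomic.h>

/* Hypothetical flag values, standing in for the ext3 i_state bits. */
#define STATE_NEW   0x1u
#define STATE_OTHER 0x2u

/* Clear bits with a single atomic read-modify-write, so a concurrent
 * state_set() on a different bit cannot be lost. */
void state_clear(atomic_uint *state, unsigned int bits)
{
    atomic_fetch_and(state, ~bits);
}

/* Set bits atomically. */
void state_set(atomic_uint *state, unsigned int bits)
{
    atomic_fetch_or(state, bits);
}

/* Return nonzero if any of the given bits is currently set. */
int state_test(atomic_uint *state, unsigned int bits)
{
    return (atomic_load(state) & bits) != 0;
}
```

With a plain `state &= ~STATE_NEW`, the load and the store are separate steps, and a bit set by another CPU between them is overwritten by the store.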
Yeah, the locking around handling of i_state and i_flags is kind of
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR