Re: ext2 filesystem corruption?!?!??

Theodore Y. Ts'o (tytso@MIT.EDU)
Wed, 2 Apr 1997 09:15:17 -0500


Date: Wed, 2 Apr 1997 09:49:57 +0000 (GMT)
From: Mark Hemment <markhe@nextd.demon.co.uk>

I'm not convinced your patch removes the corruption problem.
Simply locking the inode at the start of clear_inode() does not prevent
another process gaining a referernce to the inode (via __iget() or
get_empty_inode()). This is the race.

In the original fs/inode.c, get_empty_inode() will try to avoid
selecting inodes that are locked. In fact, I think it was supposed to
avoid them altogether, but there was a case (which I think was a bug)
where it might conceivably select an inode that was locked. My patch
fixed that, so that get_empty_inode will never select an inode where
i_lock is non-zero.

Similarly, __iget(), will wait until the until the inode is unlocked,
and then double check to make sure that the inode number hasn't changed.
(Get_empty_inode has some nonsense code where the inode number is set to
an monotonically increasing number; which is silly since the filesystem
dependent code will be setting the inode number to some real value
anyway.)

So *think* your concerns have been addressed. Again, it didn't seem to
solve the AFS corruption problem, though.

The logic behind most of the fs/inode.c code is that when a "maintainance"
operation is performed on an inode it needs locking, _and_ needs a
reference increment for that operation! This is the reason why I saw a
race in sync_inodes(). As far as I can tell, the only other operation
which does not inc the count is the call to write_inode() in
get_empty_inode().

No, I don't think so. lock_inode() is supposed to wait until the inode
is unlocked, and then grab the lock. There shouldn't be any cases
where the inode is locked twice (otherwise what would be the point of
locking it? :-)

- Ted