Re: Remove BKL from FAT/VFAT/MSDOS (v1) (was Re: Fw: Regressioncaused by bf726e "semaphore: fix,")

From: Linus Torvalds
Date: Thu May 29 2008 - 17:37:53 EST




On Thu, 29 May 2008, Tony Luck wrote:
>
> This causes a lockup when overwriting an existing file
> on a vfat filesystem. E.g. for me (where there already
> exists a vmlinux.gz file in the target directory):

Thanks, that was something I didn't test. I just tested various simple
file create/delete/read/write.

> Looking at fat_setattr() I see it calls lock_super() at the start and releases
> it at the end. In between is the call to inode_setattr() ... which calls
> down through inode_setattr() and vmtruncate() to fat_truncate() ...
> which calls lock_super() again. Deadlock.

Yup.

And as far as I can tell there is absolutely nothing in fat_setattr() that
we actually want to protect - it calls things like fat_cont_expand(), but
that just calls down to the generic VFS layer that uses the pagecache
functions, and those need to handle the locking correctly for other
reasons (totally unrelated to setattr - they get called without locking
for regular writes, after all).

So it looks like the correct fix is to just remove the lock_super() in
fat_setattr() entirely (along with the "sb" variable that is then no
longer used).

It *also* turns out that we should remove the lock_super() from
fat_truncate: all the truncate paths already hold the inode mutex from the
VFS layer, so the inode data structures themselves would be serialized for
other reasons. And it only protects the call to "fat_free()", which in
turn calls the FAT cluster routines ("fatent") that already are protected
by the fatent spinlock.

More importantly, if it's a filesystem marked DIRSYNC, it will call
"fat_sync_inode()", which takes the superblock lock (and needs it, because
it's called frm the VFS layer too) and would deadlock there.

So the end result of that is all the "lock_kernel()" calls in
fs/fat/file.c should actually just go away - not be replaced by
lock_super() at alL!

Jonathan, do you want an updated replacement patch, or an incremental one,
or will you just do that trivial fix yourself?

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/