Re: [PATCH] configfs: Silence lockdep on mkdir(), rmdir() andconfigfs_depend_item()

From: Louis Rilling
Date: Fri Dec 19 2008 - 05:29:25 EST


On 18/12/08 14:58 -0800, Joel Becker wrote:
> On Thu, Dec 18, 2008 at 01:28:28PM +0100, Peter Zijlstra wrote:
> > In fact, both (configfs) mkdir and rmdir seem to synchronize on
> > su_mutex..
> >
> > mkdir B/C/bar
> >
> > C.i_mutex
> > su_mutex
> >
> > vs
> >
> > rmdir foo
> >
> > parent(foo).i_mutex
> > foo.i_mutex
> > su_mutex
> >
> >
> > once holding the rmdir su_mutex you can check foo's user-content, since
> > any mkdir will be blocked. All you have to do is then re-validate in
> > mkdir's su_mutex that !IS_DEADDIR(C).
>
> We explicitly do not take any i_mutex locks after taking
> su_mutex. That's an ABBA risk. su_mutex protects the hierarchy of
> config_items. i_mutex protects the vfs view thereof.
> If you look in mkdir, we take su_mutex, get a new item from the
> client subsystem, then drop su_mutex. After that, we go about building
> our filesystem structure, using i_mutex where appropriate. More
> importantly is rmdir(2), where we use i_mutex in
> configfs_detach_group(), but are not holding su_sem. Only when
> configfs_detach_group() has successfully returned and we have torn down
> the filesystem structure do we take su_mutex and tear down the
> config_item structure.
> In fact, we're part of the way there. Check out that
> USET_DROPPING flag we set in detach_prep() while scanning for user
> objects. That flags us racing mkdir(2). When we are done with
> detach_prep(), we know that mkdir(2) calls racing behind us will do
> nothing until we safely lock them out with the locking in
> detach_group(). All mkdir(2) calls will have exited by the time we get
> the mutex, and no new mkdir(2) call can start because we have the mutex.
> Now look in detach_groups(). We drop the groups children before
> marking them DEAD. Louis' plan, I think, is to perhaps mark a group
> DEAD, disconnect it from the vfs, and then operate on its children. In
> this fashion, perhaps we can unlock the trailing lock like a normal VFS
> operation.
> This will require some serious auditing, however, because now
> vfs functions can get into the vfs objects behind us. And more vfs
> changes affect us. Whereas the current locking relies on the vfs's
> parent->child lock ordering only, something that isn't likely to be
> changed.

I've thought about such plan, but I'm not comfortable enough with the VFS to
tell how it could be done precisely, and whether it is safe to remove a whole
tree from the dcache by just unlinking its root. In particular, how could we
deal with racing operations under default groups? Should we setup a link from
any default group to its youngest non-default group ancestor? As Steven
suggested, looking at unmount might be interesting, but not today as far as I
am concerned.

Louis

--
Dr Louis Rilling Kerlabs
Skype: louis.rilling Batiment Germanium
Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes
http://www.kerlabs.com/ 35700 Rennes

Attachment: signature.asc
Description: Digital signature