Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems

From: Andrew Morton
Date: Fri Jun 08 2012 - 18:25:53 EST


On Sat, 9 Jun 2012 01:14:46 +0300
"Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:

> On Fri, Jun 08, 2012 at 03:02:53PM -0700, Andrew Morton wrote:
> > On Sat, 9 Jun 2012 00:41:03 +0300
> > "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> >
> > > There's no reason to call rcu_barrier() on every deactivate_locked_super().
> > > We only need to make sure that all delayed rcu free inodes are flushed
> > > before we destroy related cache.
> > >
> > > Removing rcu_barrier() from deactivate_locked_super() affects some
> > > fast paths. E.g. on my machine exit_group() of the last process in IPC
> > > namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.
> >
> > What an unpleasant patch. Is final-process-exiting-ipc-namespace a
> > sufficiently high-frequency operation to justify the change?

This, please.

> > I don't really understand what's going on here. Are you saying that
> > there is some filesystem against which we run deactivate_locked_super()
> > during exit_group(), and that this filesystem doesn't use rcu-freeing
> > of inodes? The description needs this level of detail, please.

You still haven't explained where this deactivate_locked_super() call
is coming from. Oh well.

> I think the rcu_barrier() is in wrong place. We need it to safely destroy
> inode cache. deactivate_locked_super() is part of umount() path, but all
> filesystems I've checked have inode cache for whole filesystem, not
> per-mount.

Well from a design perspective, putting the rcu_barrier() in the vfs is
the *correct* place. Individual filesystems shouldn't be hard-coding
knowledge about vfs internal machinery.

A neater implementation might be to add a kmem_cache* argument to
unregister_filesystem(). If that is non-NULL, unregister_filesystem()
does the rcu_barrier() and destroys the cache. That way we get to
delete (rather than add) a bunch of code from all filesystems and new
and out-of-tree filesystems cannot forget to perform the rcu_barrier().
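
As a rough illustration of that idea, here is a userspace model (not
kernel code). The name unregister_filesystem_cache() is hypothetical, and
rcu_barrier() and kmem_cache_destroy() are stubbed so the control flow
compiles outside the kernel; the real unregister_filesystem() takes only
the file_system_type pointer.

```c
/* Userspace model only: kernel types and calls are stubbed out so the
 * control flow can be compiled and exercised outside the kernel. */
#include <stddef.h>

struct kmem_cache { int destroyed; };
struct file_system_type { const char *name; int registered; };

int rcu_barrier_calls;          /* counts stubbed rcu_barrier() calls */

void rcu_barrier(void) { rcu_barrier_calls++; }
void kmem_cache_destroy(struct kmem_cache *c) { c->destroyed = 1; }

/* Hypothetical two-argument variant of unregister_filesystem(). */
int unregister_filesystem_cache(struct file_system_type *fs,
                                struct kmem_cache *inode_cache)
{
    if (!fs->registered)
        return -1;              /* stands in for -EINVAL */
    fs->registered = 0;

    if (inode_cache) {
        /* Wait for pending RCU-freed inodes before the cache goes away. */
        rcu_barrier();
        kmem_cache_destroy(inode_cache);
    }
    return 0;
}
```

In this model a filesystem's exit path shrinks to one call that passes
its inode cache, and a filesystem without a private cache passes NULL
and pays no barrier.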

> > The implementation would be less unpleasant if we could do the
> > rcu_barrier() in kmem_cache_destroy(). I can't see a way of doing that
> > without adding a dedicated slab flag, which would require editing all
> > the filesystems anyway.
>
> I think rcu_barrier() for all kmem_cache_destroy() would be too expensive.

That is not what I proposed.
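
The suggestion was a dedicated slab flag, so that only caches which opt
in pay for the barrier. A userspace sketch of that shape follows; the
flag name SLAB_RCU_BARRIER_ON_DESTROY is made up for illustration, and
rcu_barrier() and kmem_cache_destroy() are stubs, not the kernel
implementations.

```c
/* Userspace model only: stubs stand in for the kernel's slab and RCU code. */
#include <stddef.h>

#define SLAB_RCU_BARRIER_ON_DESTROY 0x1u   /* hypothetical flag */

struct kmem_cache { unsigned int flags; int destroyed; };

int rcu_barrier_calls;          /* counts stubbed rcu_barrier() calls */

void rcu_barrier(void) { rcu_barrier_calls++; }

void kmem_cache_destroy(struct kmem_cache *c)
{
    /* Only caches created with the flag wait for RCU callbacks. */
    if (c->flags & SLAB_RCU_BARRIER_ON_DESTROY)
        rcu_barrier();
    c->destroyed = 1;
}
```

Caches created without the flag are destroyed exactly as before, which
is why this is not the same as a barrier on every kmem_cache_destroy().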