Re: inodes: Support generic defragmentation

From: Dave Chinner
Date: Fri Jan 29 2010 - 21:43:54 EST


On Fri, Jan 29, 2010 at 02:49:42PM -0600, Christoph Lameter wrote:
> This implements the ability to remove inodes in a particular slab
> from inode caches. In order to remove an inode we may have to write out
> the pages of an inode, the inode itself and remove the dentries referring
> to the node.
>
> Provide generic functionality that can be used by filesystems that have
> their own inode caches to also tie into the defragmentation functions
> that are made available here.
>
> FIXES NEEDED!
>
> Note Miklos comments on the patch at http://lkml.indiana.edu/hypermail/linux/kernel/0810.1/2003.html
>
> The way we obtain a reference to a inode entry may be unreliable since inode
> refcounting works in different ways. Also a reference to the superblock is necessary
> in order to be able to operate on the inodes.
>
> Cc: Miklos Szeredi <miklos@xxxxxxxxxx>
> Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>
> Reviewed-by: Rik van Riel <riel@xxxxxxxxxx>
> Signed-off-by: Christoph Lameter <clameter@xxxxxxx>
> Signed-off-by: Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx>
>
> ---
> fs/inode.c | 123 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/fs.h | 6 ++
> 2 files changed, 129 insertions(+)
>
> Index: linux-2.6/fs/inode.c
> ===================================================================
> --- linux-2.6.orig/fs/inode.c 2010-01-29 12:03:04.000000000 -0600
> +++ linux-2.6/fs/inode.c 2010-01-29 12:03:25.000000000 -0600
> @@ -1538,6 +1538,128 @@ static int __init set_ihash_entries(char
> __setup("ihash_entries=", set_ihash_entries);
>
> /*
> + * Obtain a refcount on a list of struct inodes pointed to by v. If the
> + * inode is in the process of being freed then zap the v[] entry so that
> + * we skip the freeing attempts later.
> + *
> + * This is a generic function for the ->get slab defrag callback.
> + */
> +void *get_inodes(struct kmem_cache *s, int nr, void **v)
> +{
> + int i;
> +
> + spin_lock(&inode_lock);
> + for (i = 0; i < nr; i++) {
> + struct inode *inode = v[i];
> +
> + if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE))
> + v[i] = NULL;
> + else
> + __iget(inode);
> + }
> + spin_unlock(&inode_lock);
> + return NULL;
> +}
> +EXPORT_SYMBOL(get_inodes);

How do you expect defrag to behave when the filesystem doesn't free
the inode immediately during dispose_list()? That is, the above code
only finds inodes that are still active at the VFS level but they
may still live for a significant period of time after the
dispose_list() call. This is a real issue now that XFS has combined
the VFS and XFS inodes into the same slab...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/