Re: [External] Re: [PATCH 0/9] Shrink the list lru size on memory cgroup removal

From: Muchun Song
Date: Fri May 07 2021 - 01:46:38 EST


On Wed, May 5, 2021 at 9:13 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Mon, May 03, 2021 at 02:33:21PM +0800, Muchun Song wrote:
> > On Mon, May 3, 2021 at 7:58 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > > > If the user wants to insert the allocated object into its lru list in
> > > > the future, the user should use list_lru_kmem_cache_alloc() instead of
> > > > kmem_cache_alloc().
> > > > I have looked at the code closely. There are 3 different kmem_caches that
> > > > need to use this new API to allocate memory. They are inode_cachep,
> > > > dentry_cache and radix_tree_node_cachep. I think that it is easy to migrate.
> > >
> > > It might work, but I think you may have overlooked the complexity
> > > of inode allocation for filesystems. i.e. alloc_inode() calls out
> > > to filesystem allocation functions more often than it allocates
> > > directly from the inode_cachep. i.e. Most filesystems provide
> > > their own ->alloc_inode superblock operation, and they allocate
> > > inodes out of their own specific slab caches, not the inode_cachep.
> >
> > I didn't realize this before. You are right. Most filesystems
> > have their own kmem_cache instead of inode_cachep.
> > A lot of filesystem-specific code will need to be changed.
> > Thanks for your reminder.
> >
> > >
> > > And then you have filesystems like XFS, where alloc_inode() will
> > > never be called, and which implement ->alloc_inode as:
> > >
> > > /* Catch misguided souls that try to use this interface on XFS */
> > > STATIC struct inode *
> > > xfs_fs_alloc_inode(
> > > 	struct super_block	*sb)
> > > {
> > > 	BUG();
> > > 	return NULL;
> > > }
> > >
> > > Because all the inode caching and allocation is internal to XFS and
> > > VFS inode management interfaces are not used.
> > >
> > > So I suspect that an external wrapper function is not the way to go
> > > here - either internalising the LRU management into the slab
> > > allocation or adding the memcg code to alloc_inode() and filesystem
> > > specific routines would make a lot more sense to me.
> >
> > Sure. If we introduce kmem_cache_alloc_lru, all filesystems
> > need to migrate to kmem_cache_alloc_lru. I cannot figure out
> > an approach that does not need to change filesystems code.
>
> Right, I don't think there's a way to avoid changing all the
> filesystem code if we are touching the cache allocation routines.
> However, if we hide it all inside the allocation routine, then
> the change to each filesystem is effectively just a 1-liner like:
>
> - inode = kmem_cache_alloc(inode_cache, GFP_NOFS);
> + inode = kmem_cache_alloc_lru(inode_cache, &sb->s_inode_lru, GFP_NOFS);
>
> Or perhaps, define a generic wrapper function like:
>
> static inline void *
> alloc_inode_sb(struct super_block *sb, struct kmem_cache *cache, gfp_t gfp)
> {
> 	return kmem_cache_alloc_lru(cache, &sb->s_inode_lru, gfp);
> }

Good idea. I am doing this. A preliminary patch is expected next week.

Thanks.

>
> And then each filesystem ends up with:
>
> - inode = kmem_cache_alloc(inode_cache, GFP_NOFS);
> + inode = alloc_inode_sb(sb, inode_cache, GFP_NOFS);
>
> so that all the superblock LRU stuff is also hidden from the
> filesystems...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
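
For readers following along, the wrapper idea quoted above can be tried out
in userspace. What follows is only a rough sketch under stated assumptions,
not kernel code: every type here is a mock, kmem_cache_alloc_lru() is the
allocator proposed in this thread (mocked so that it merely counts
allocations against the LRU it is given), and the "myfs" filesystem, its
inode type, its cache, and the nr_allocs field are hypothetical names made
up for illustration. s_inode_lru is modelled as a struct embedded in the
superblock, so the sketch passes its address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace mocks only; none of these are the kernel's definitions. */
typedef unsigned int gfp_t;
#define GFP_NOFS 0u			/* placeholder value */

struct list_lru { unsigned long nr_allocs; };	/* hypothetical counter */
struct kmem_cache { size_t object_size; };
struct super_block { struct list_lru s_inode_lru; };

/*
 * Mock of the LRU-aware allocator proposed in the thread. A real version
 * would set up the memcg-aware list_lru state; this one only records that
 * the allocation was associated with the given LRU.
 */
static void *kmem_cache_alloc_lru(struct kmem_cache *cache,
				  struct list_lru *lru, gfp_t gfp)
{
	(void)gfp;
	assert(cache != NULL && lru != NULL);
	lru->nr_allocs++;
	return calloc(1, cache->object_size);
}

/* The generic wrapper: the superblock LRU is hidden from the filesystem. */
static inline void *
alloc_inode_sb(struct super_block *sb, struct kmem_cache *cache, gfp_t gfp)
{
	return kmem_cache_alloc_lru(cache, &sb->s_inode_lru, gfp);
}

/* Hypothetical filesystem with its own inode type and slab cache. */
struct myfs_inode { int ino; };

static struct kmem_cache myfs_inode_cache = { sizeof(struct myfs_inode) };

/* The filesystem's allocation path after the 1-line conversion. */
static struct myfs_inode *myfs_alloc_inode(struct super_block *sb)
{
	return alloc_inode_sb(sb, &myfs_inode_cache, GFP_NOFS);
}
```

With the wrapper in place, the filesystem side never names s_inode_lru
itself, so any later change to how the LRU is tracked stays inside
alloc_inode_sb().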