Re: [PATCH RFC] xfs, memcg: Call xfs_fs_nr_cached_objects() only in case of global reclaim

From: Dave Chinner
Date: Thu Mar 15 2018 - 19:05:04 EST


On Thu, Mar 15, 2018 at 10:28:43PM +0300, Kirill Tkhai wrote:
> On 15.03.2018 20:49, Michal Hocko wrote:
> > On Thu 15-03-18 18:01:34, Kirill Tkhai wrote:
> >> xfs_reclaim_inodes_count(XFS_M(sb)) does not care about memcg.
> >> So, it's called for memcg reclaim too, e.g. this list is shrinked
> >> disproportionality to another lists.
> >>
> >> This looks confusing, so I'm reporting about this.
> >> Consider this patch as RFC.
> >
> > Could you be more specific about the problem you are trying to solve?
> > Because we do skip shrinkers which are not memcg aware by
> > shrink_slab:
> > /*
> > * If kernel memory accounting is disabled, we ignore
> > * SHRINKER_MEMCG_AWARE flag and call all shrinkers
> > * passing NULL for memcg.
> > */
> > if (memcg_kmem_enabled() &&
> > !!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE))
> > continue;
> >
> > Or am I missing something?
>
> sb->s_op->nr_cached_objects is a sub-method of generic super_cache_count().
> super_cache_count() is owned and only called by superblock's shrinker,
> which does have SHRINKER_MEMCG_AWARE flag.

Xfs inodes are accounted to memcgs when they are allocated. All the
memcg reclaim stuff is done at the VFS inode cache level - all the
XFS inode cache shrinker does is clean up inodes that are not
referenced by the VFS inode cache because the memcg aware reclaim
has already freed them.

i.e. what the XFS inode cache is doing is perfectly reasonable -
memcg aware inode reclaim is occuring at the VFS level, but on XFS
that does not free the inodes as they are still referenced
internally by XFS. However, once the inode is removed from the VFS
LRU, all memcg information about the inode is destroyed, so there's
nothing in the XFS layers that cares about memcgs.

Hence when the XFS inode shrinker then called to run a
garbage collection pass on unreferenced inodes - the inodes that
are now unreferenced in the memcg due to the VFS inode shrinker pass
- it frees inodes regardless of the memcg context it was called from
because that information is no longer present in the inode cache.
Hence we just ignore memcgs at this level.

So, is there a problem you are actually trying to fix, or is this
simply a "I don't understand how the superblock shrinkers work,
please explain" patch?

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx