Re: [patch 9/9] mm: keep page cache radix tree nodes in check

From: Johannes Weiner
Date: Thu Jan 16 2014 - 17:11:10 EST

On Wed, Jan 15, 2014 at 01:55:01PM +0800, Bob Liu wrote:
> Hi Johannes,
> On 01/11/2014 02:10 AM, Johannes Weiner wrote:
> > Previously, page cache radix tree nodes were freed after reclaim
> > emptied out their page pointers. But now reclaim stores shadow
> > entries in their place, which are only reclaimed when the inodes
> > themselves are reclaimed. This is problematic for bigger files that
> > are still in use after they have a significant amount of their cache
> > reclaimed, without any of those pages actually refaulting. The shadow
> > entries will just sit there and waste memory. In the worst case, the
> > shadow entries will accumulate until the machine runs out of memory.
> >
> I have one more question. It seems that other algorithms only remember
> history information for a limited number of evicted pages, where that
> number is usually the same as the total cache or memory size.
> But in your patch, I didn't see a preferred value for how many evicted
> pages' history information should be recorded. Does it all depend on the
> workingset_shadow_shrinker?

That "same as total cache" number is a fairly arbitrary cut-off that
defines how far back we record eviction history. For this patch set, we
technically do not need more shadow entries than active pages, but
strict enforcement would be very expensive. So we leave it mostly to
refaults and inode reclaim to keep the number of shadow entries low,
with the shadow shrinker as an emergency backup.

Keep in mind that the shadow entries represent the part of the working
set that exceeds available memory. So the only way the number of shadow
entries can exceed the number of RAM pages in the system is if your
working set is more than twice the size of memory; otherwise the shadow
entries refault before they can accumulate. And because of inode
reclaim, that huge working set would have to be backed by a very small
number of files; otherwise the shadow entries are reclaimed along with
the inodes. But this theoretical workload would be entirely IO bound,
and a few extra MB wasted on shadow entries should make no difference.
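
To put rough numbers on the "more than twice that of memory" argument,
here is a minimal back-of-the-envelope sketch. It is not code from the
patch set: the helper names are hypothetical, and it assumes a
simplified steady state in which the shadow entries are exactly the
part of the working set that does not fit into memory, ignoring inode
reclaim and the shadow shrinker.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not kernel code: steady-state estimate of the
 * number of shadow entries, modeled as the part of the working set
 * that does not fit into memory.  All quantities are in pages.
 */
static unsigned long shadow_entries_estimate(unsigned long working_set_pages,
					     unsigned long ram_pages)
{
	if (working_set_pages <= ram_pages)
		return 0;	/* everything fits, nothing stays evicted */
	return working_set_pages - ram_pages;
}

/*
 * Under this model, shadow entries outnumber RAM pages only if the
 * working set is more than twice the size of memory.
 */
static bool shadow_entries_exceed_ram(unsigned long working_set_pages,
				      unsigned long ram_pages)
{
	return shadow_entries_estimate(working_set_pages, ram_pages) > ram_pages;
}

/* Small self-check using 4G of RAM in 4k pages (1048576 pages). */
static void shadow_model_selfcheck(void)
{
	const unsigned long ram = 1048576UL;

	/* Working set of 1.5x RAM: half of RAM worth of shadow entries. */
	assert(shadow_entries_estimate(ram + ram / 2, ram) == ram / 2);
	assert(!shadow_entries_exceed_ram(ram + ram / 2, ram));

	/* Working set of 3x RAM: shadow entries outnumber RAM pages. */
	assert(shadow_entries_exceed_ram(3 * ram, ram));
}
```

With these assumed numbers, a working set of 1.5 times RAM leaves about
524288 shadow entries, well below the 1048576 RAM pages, and only a
working set beyond twice RAM pushes the estimate past the page count.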