Re: [PATCH] mm: fadvise: Drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages

From: Michal Hocko
Date: Fri Feb 15 2013 - 11:48:34 EST


On Fri 15-02-13 17:14:10, Rob van der Heij wrote:
> On 15 February 2013 12:04, Michal Hocko <mhocko@xxxxxxx> wrote:
> > On Thu 14-02-13 12:39:26, Andrew Morton wrote:
> >> On Thu, 14 Feb 2013 12:03:49 +0000
> >> Mel Gorman <mgorman@xxxxxxx> wrote:
> >>
> >> > Rob van der Heij reported the following (paraphrased) on private mail.
> >> >
> >> > The scenario is that I want to avoid backups to fill up the page
> >> > cache and purge stuff that is more likely to be used again (this is
> >> > with s390x Linux on z/VM, so I don't give it as much memory that
> >> > we don't care anymore). So I have something with LD_PRELOAD that
> >> > intercepts the close() call (from tar, in this case) and issues
> >> > a posix_fadvise() just before closing the file.
> >> >
> >> > This mostly works, except for small files (less than 14 pages)
> >> > that remain in page cache after the fact.
> >>
> >> Sigh. We've had the "my backups swamp pagecache" thing for 15 years
> >> and it's still happening.
> >>
> >> It should be possible nowadays to toss your backup application into a
> >> container to constrain its pagecache usage. So we can type
> >>
> >> run-in-a-memcg -m 200MB /my/backup/program
> >>
> >> and voila. Does such a script exist and work?
> >
> > The script would be as simple as:
> > cgcreate -g memory:backups/`whoami`
> > cgset -r memory.limit_in_bytes=200M backups/`whoami`
> > cgexec -g memory:backups/`whoami` /my/backup/program
> >
> > It just expects that the admin sets up a backups group which allows the
> > user to create a subgroup (write permission on the directory) and
> > probably sets up some reasonable cap for all backups.
>
> Cool. This is promising enough to bridge my skills gap. It appears to
> work as promised, but I would have to understand why it takes
> significantly more CPU than my ugly posix_fadvise() call on close...

I would guess that a lot of reclaim is the answer. Note that each
memcg has its own LRU and the limit is enforced by per-group
reclaim.
I wouldn't expect the difference to be very big, though. What do you
mean by significantly more?
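
For comparison, the fadvise-on-close approach boils down to roughly the
sketch below. This is only an illustration: it is in Python for brevity,
the helper name read_and_drop() is made up, and I am assuming your
LD_PRELOAD shim does the equivalent in C right before close(). It drops
the pages of one file in a single call, whereas under a memcg limit the
pages are reclaimed from the group's LRU as the limit is hit.

```python
import os


def read_and_drop(path, chunk_size=1 << 20):
    """Read a file the way a backup tool would, then ask the kernel to
    drop its pages from the page cache just before closing it.

    Hypothetical helper for illustration; returns the bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.read(fd, chunk_size)
            if not buf:
                break
            total += len(buf)
        # offset=0, len=0 covers the whole file. This is only a hint:
        # dirty pages are not discarded, so a writer would have to
        # fsync() first. A pure reader like this one has clean pages.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```

Calling read_and_drop() on every file in the backup set should mimic
what the shim does, so it can be timed against the cgexec variant.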

--
Michal Hocko
SUSE Labs