Re: [PATCH 12/12] vmscan: Do not writeback pages in direct reclaim

From: KAMEZAWA Hiroyuki
Date: Tue Jun 15 2010 - 20:35:35 EST

On Tue, 15 Jun 2010 14:54:08 +0100
Mel Gorman <mel@xxxxxxxxx> wrote:

> On Tue, Jun 15, 2010 at 09:37:27AM -0400, Christoph Hellwig wrote:
> > On Tue, Jun 15, 2010 at 09:34:18AM -0400, Rik van Riel wrote:
> > > If direct reclaim can overflow the stack, so can direct
> > > memcg reclaim. That means this patch does not solve the
> > > stack overflow, while admitting that we do need the
> > > ability to get specific pages flushed to disk from the
> > > pageout code.
> >
> > Can you explain what the hell memcg reclaim is and why it needs
> > to reclaim from random contexts?
> Kamezawa Hiroyuki has the full story here but here is a summary.
Thank you.

> memcg is the Memory Controller cgroup
> (Documentation/cgroups/memory.txt). It's intended for the control of the
> amount of memory usable by a group of processes but its behaviour in
> terms of reclaim differs from global reclaim. It has its own LRU lists
> and kswapd operates on them.

No, we don't use kswapd. But we have some hooks in kswapd for implementing
soft-limit. Soft-limit gives kswapd a hint, "please reclaim memory from this
memcg", when global memory is exhausted and kswapd runs.
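
To illustrate the soft-limit hint, here is a rough sketch. It is a toy
userspace model, not the kernel code: the struct, the function names and the
"pick the group furthest over its soft limit" policy are all assumptions made
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical. */
struct toy_group {
    size_t usage;       /* pages currently charged to the group */
    size_t soft_limit;  /* soft limit in pages */
};

/* How far a group is above its soft limit (0 if it is at or below it). */
static size_t toy_excess(const struct toy_group *g)
{
    return g->usage > g->soft_limit ? g->usage - g->soft_limit : 0;
}

/*
 * Called from the (modelled) global reclaim path: return the index of the
 * group that exceeds its soft limit the most, or -1 if no group is over.
 */
static int toy_pick_soft_limit_victim(const struct toy_group *groups, size_t n)
{
    int victim = -1;
    size_t worst = 0;

    assert(groups != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t ex = toy_excess(&groups[i]);
        if (ex > worst) {
            worst = ex;
            victim = (int)i;
        }
    }
    return victim;
}
```

The point is only that the hint matters when global reclaim is already
running; a group under its soft limit is never chosen in this model.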

What a memcg uses when it hits its limit is just direct reclaim.
(*) Justifying the use of CPU by a kswapd because a memcg hits its limit is
difficult for me. So, I haven't used kswapd so far.
When direct reclaim is used, the cost of reclaim is charged against
the cpu cgroup which the thread belongs to.
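
As a rough sketch of "hit the limit, then reclaim in the charging thread's own
context": the code below is a toy userspace model, not the kernel
implementation. All names are made up for illustration, and the retry and OOM
handling of the real code are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical. */
struct toy_memcg {
    size_t usage;            /* pages currently charged */
    size_t limit;            /* hard limit in pages */
    size_t reclaimable;      /* charged pages that reclaim could free */
    size_t direct_reclaims;  /* reclaim passes run by charging threads */
};

/* Free up to nr pages from this group, in the caller's own context. */
static size_t toy_direct_reclaim(struct toy_memcg *cg, size_t nr)
{
    size_t freed = nr < cg->reclaimable ? nr : cg->reclaimable;

    cg->reclaimable -= freed;
    cg->usage -= freed;
    cg->direct_reclaims++;
    return freed;
}

/*
 * Charge nr pages to the group. If the charge would exceed the limit, the
 * charging thread itself reclaims; no background thread is involved.
 */
static bool toy_charge(struct toy_memcg *cg, size_t nr)
{
    assert(cg->usage <= cg->limit);
    assert(cg->reclaimable <= cg->usage);

    if (nr > cg->limit)
        return false;
    if (cg->usage + nr > cg->limit) {
        size_t need = cg->usage + nr - cg->limit;

        if (toy_direct_reclaim(cg, need) < need)
            return false;  /* could not make room; charge fails here */
    }
    cg->usage += nr;
    return true;
}
```

Because the reclaim work is done by the thread that calls toy_charge(), its CPU
time lands on that thread, which is the accounting property described above.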

> What is surprising is that direct reclaim
> for a process in the control group also does not operate within the
> cgroup.
Sorry, I can't understand ....

> Reclaim from a cgroup happens from the fault path. The new page is
> "charged" to the cgroup. If it exceeds its allocated resources, some
> pages within the group are reclaimed in a path that is similar to direct
> reclaim except for its entry point.

> So, memcg is not reclaiming from a random context, there is a limited
> number of cases where a memcg is reclaiming and it is not expected to
> overflow the stack.

I think so. In particular, we'll never see the 1k stack use of select().

> > It seems everything that has a cg in it's name that I stumbled over
> > lately seems to be some ugly wart..
> >
> The wart in this case is that the behaviour of page reclaim within a
> memcg and globally differ a fair bit.

Sorry. But there has been a very long story behind the current implementation.
But don't worry, if memcg is not activated (not mounted), it doesn't affect
the behavior of processes ;)

But Hmm..

>[kamezawa@bluextal mmotm-2.6.35-0611]$ wc -l mm/memcontrol.c
>4705 mm/memcontrol.c

It may need a diet :(


