Re: [RFC][PATCH 4/4] memcg: make oom less frequently

From: Daisuke Nishimura
Date: Thu Jan 08 2009 - 21:32:22 EST


On Fri, 9 Jan 2009 11:03:58 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> On Fri, 9 Jan 2009 10:44:16 +0900
> Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx> wrote:
>
> > > To handle live-lock situation as "reclaimed memory is stolen very soon",
> > > should we check signal_pending(current) or some flags ?
> > >
> > > IMHO, using jiffies to detect how long we should retry is easy to understand
> > > ....like
> > > "if memory charging cannot make progress for XXXX minutes,
> > > trigger some notifier or show some flag to user via cgroupfs interface.
> > > to show we're tooooooo busy."
> > >
> > Good Idea.
> >
> > But I think it would be enough for now to check signal_pending(curren) and
> > return -ENOMEM.
> >
> > How about this one?
>
> Hmm, looks much simpler.
>
> > ===
> > From: Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx>
> >
> > In previous implementation, mem_cgroup_try_charge checked the return
> > value of mem_cgroup_try_to_free_pages, and just retried if some pages
> > had been reclaimed.
> > But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it)
> > only checks whether the usage is less than the limit.
> >
> > This patch tries to change the behavior as before to cause oom less frequently.
> >
> >
> > Signed-off-by: Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx>
> > ---
> > mm/memcontrol.c | 14 ++++++++++----
> > 1 files changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index dc38a0e..2ab0a5c 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -770,10 +770,10 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> > * but there might be left over accounting, even after children
> > * have left.
> > */
> > - ret = try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap,
> > + ret += try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap,
> > get_swappiness(root_mem));
> > if (mem_cgroup_check_under_limit(root_mem))
> > - return 0;
> > + return 1; /* indicate reclaim has succeeded */
> > if (!root_mem->use_hierarchy)
> > return ret;
> >
> > @@ -784,10 +784,10 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> > next_mem = mem_cgroup_get_next_node(root_mem);
> > continue;
> > }
> > - ret = try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap,
> > + ret += try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap,
> > get_swappiness(next_mem));
> > if (mem_cgroup_check_under_limit(root_mem))
> > - return 0;
> > + return 1; /* indicate reclaim has succeeded */
> > next_mem = mem_cgroup_get_next_node(root_mem);
> > }
> > return ret;
> > @@ -870,8 +870,13 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
> > if (!(gfp_mask & __GFP_WAIT))
> > goto nomem;
> >
> > + if (signal_pending(current))
> > + goto oom;
> > +
>
> I think it's better to avoid to add this check *now*. and "signal is pending"
> doesn't mean oom situation.
>
hmm.. charge is assumed to return 0 or -ENOMEM, what should we return on
signal_pending case ?

In case of shmem for example, if charge at shmem_getpage fails by -ENOMEM,
shmem_fault returns VM_FAULT_OOM, so pagefault_out_of_memory would be called.
If memcg had not invoked oom-killer, system wide oom would be invoked.

> Hmm..Maybe we can tell "please retry page fault again, it's too long latency in
> memory reclaim and you received signal." in future.
>
OK.

> IMHO, only quick path which we can add here now is
> ==
> if (test_thread_flag(TIG_MEMDIE)) { /* This thread is killed by OOM */
> *memcg = NULL;
> return 0;
> }
> ==
> like this.
>
> Anyway, please discuss this "quick exit path" in other patch and just remove
> siginal check.
>
> Other part looks ok to me.
>
Thanks :)

I'll update this one by removing the signal_pendign check.


Thanks,
Daisuke Nishimura.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/