Re: [PATCH] oom-kill: add lowmem usage aware oom kill handling

From: KAMEZAWA Hiroyuki
Date: Thu Jan 21 2010 - 20:09:52 EST


On Fri, 22 Jan 2010 09:40:17 +0900
Minchan Kim <minchan.kim@xxxxxxxxx> wrote:

> On Fri, Jan 22, 2010 at 8:48 AM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> > On Fri, 22 Jan 2010 00:18:44 +0900
> > Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> >
> >> Hi, Kame.
> >>
> >> On Thu, 2010-01-21 at 14:59 +0900, KAMEZAWA Hiroyuki wrote:
> >> > A patch for avoiding oom-serial-killer at lowmem shortage.
> >> > Patch is onto mmotm-2010/01/15 (depends on mm-count-lowmem-rss.patch)
> >> > Tested on x86-64/SMP + debug module (to allocate lowmem), works well.
> >> >
> >> > ==
> >> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> >> >
> >> > One cause of OOM-Killer is memory shortage in lower zones.
> >> > (If memory is enough, lowmem_reserve_ratio works well. but..)
> >> >
> >> > In a lowmem-shortage oom-kill, the oom-killer chooses a victim process
> >> > based on its vm size. But this kills a process which holds lowmem
> >> > only if we are lucky. In the end, there will be an oom-serial-killer.
> >> >
> >> > Now, we have per-mm lowmem usage counter. We can make use of it
> >> > to select a good? victim.
> >> >
> >> > This patch does
> >> >  - add CONSTRAINT_LOWMEM to oom's constraint type.
> >> >  - pass constraint to __badness()
> >> >  - change calculation based on constraint. If CONSTRAINT_LOWMEM,
> >> >    use low_rss instead of vmsize.
> >>
> >> As far as low memory is concerned, it would be better to consider the
> >> lowmem counter. But as you know, {vmsize VS rss} is a debatable topic.
> >> Maybe someone doesn't like this idea.
> >>
> > About lowmem, vmsize never works well.
> >
>
> I tend to agree with you.
> I am just worried about "vmsize lovers".
>
> You removed consideration of vmsize totally.
> In the LOWMEM case, considering the lowmem count makes sense.
> But never considering vmsize might be debatable.
>
> So personally, I thought we could give the lowmem count more weight
> in the LOWMEM case. But I changed my mind.
> I think it would make the OOM heuristic more complicated without big benefit.
>
Thanks. I don't want the patch to be dropped again, either :)

> Simple is best.
>
> >> So don't we need any test result at least?
> > My test result was very artificial, so I didn't attach the result.
> >
> >  - Before this patch, sshd was killed at first.
> >  - After this patch, memory consumer of low-rss was killed.
>
> Okay. You already answered my question by Balbir's reply.
> I had a question whether it's a real problem and how often it happens.
>
> >
> >> If we don't have this patch, it happens several innocent process
> >> killing. but we can't prevent it by this patch.
> >>
> > I can't catch what you mean.
>
> I was just stating your patch's benefit.
>
> >> Sorry for bothering you.
> >>
> >
> > Hmm, boot option or CONFIG ? (CONFIG_OOMKILLER_EXTENSION ?)
> >
> > I'm now writing a fork-bomb detector again and want to remove the current
> > "gathering child's vm_size" heuristics. I'd like to put that under
> > the same config, too.
>
> In general, I don't like a CONFIG option for that.
> But vmsize lovers also don't want to change the current behavior.
> So it's desirable until your fork-bomb detector becomes mature and
> proves it's good.
>
Hmm, Okay, I'll add some. Kosaki told me sysctl is better. I'll check
how it looks.

> One more question about the code below.
>
> +	if (constraint != CONSTRAINT_LOWMEM) {
> +		list_for_each_entry(child, &p->children, sibling) {
> +			task_lock(child);
> +			if (child->mm != mm && child->mm)
> +				points += child->mm->total_vm/2 + 1;
> +			task_unlock(child);
> +		}
>
> Why didn't you consider the children's lowmem counters in the LOWMEM case?
>
Assume processes A, B, C, D. B and C are children of A.

A (low_rss = 0)
B (low_rss = 20)
C (low_rss = 20)
D (low_rss = 20)

When we calculate A's score by the above logic, A's score may be greater than
those of B, C, and D, although A itself uses no lowmem. We do targeted oom-kill
as a sniper, not as a genocider. So, ignoring children here is better, I think.
I'll add some explanation to the changelog.
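
To make the numbers concrete, here is a minimal userspace sketch, not kernel
code. It assumes a hypothetical struct demo_task in place of task_struct and
applies the quoted children loop to low_rss instead of total_vm; all names in
it are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task; not the kernel's task_struct. */
struct demo_task {
	const char *name;
	unsigned long low_rss;                    /* lowmem pages used by this task */
	const struct demo_task *const *children;  /* may be NULL when nr_children == 0 */
	size_t nr_children;
};

/* Score from the task's own lowmem usage only (children ignored). */
unsigned long score_own_only(const struct demo_task *t)
{
	assert(t != NULL);
	return t->low_rss;
}

/*
 * Score that also adds "child usage / 2 + 1" per child, i.e. the quoted
 * total_vm heuristic applied to low_rss (an assumption for illustration).
 */
unsigned long score_with_children(const struct demo_task *t)
{
	unsigned long points;
	size_t i;

	assert(t != NULL);
	points = t->low_rss;
	for (i = 0; i < t->nr_children; i++)
		points += t->children[i]->low_rss / 2 + 1;
	return points;
}

/* Builds A (low_rss 0) with children B and C (low_rss 20 each). */
int parent_outscores_children(void)
{
	static const struct demo_task b = { "B", 20, NULL, 0 };
	static const struct demo_task c = { "C", 20, NULL, 0 };
	static const struct demo_task *const kids[] = { &b, &c };
	static const struct demo_task a = { "A", 0, kids, 2 };

	/* A: 0 + (20/2 + 1) * 2 = 22, which beats B's own score of 20. */
	return score_with_children(&a) > score_own_only(&b);
}
```

With child accumulation, A scores 22 and would be picked ahead of B, C and D
(20 each), yet killing A frees no lowmem. With own usage only, A scores 0 and
one of the real lowmem users is selected.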

Thanks,
-Kame




