Re: [PATCH 2/2] oom: give bonus to frozen processes

From: Michal Hocko
Date: Mon Sep 26 2011 - 05:54:20 EST


On Mon 26-09-11 18:31:15, KAMEZAWA Hiroyuki wrote:
> On Mon, 26 Sep 2011 02:02:59 -0700 (PDT)
> David Rientjes <rientjes@xxxxxxxxxx> wrote:
>
> > On Mon, 26 Sep 2011, Michal Hocko wrote:
> >
> > > Let's try it with a heuristic change first. If you really do not like
> > > it, we can move to oom_score_adj. I like the heuristic change a little
> > > bit more because it is in the same place as the root bonus.
> >
> > The problem with the bonus is that, as mentioned previously, it doesn't
> > protect against ANYTHING for the case you're trying to fix.

Yes, it just makes this less probable.


> > This won't panic the machine because all killable threads are
> > guaranteed to have a non-zero badness score, but it's a very valid
> > configuration to have either
> >
> > - all eligible threads (system-wide, shared cpuset, shared mempolicy
> > nodes) are frozen, or
> >
> > - all eligible frozen threads use <5% of memory whereas all other
> > eligible killable threads use 1% of available memory.
> >
> > and that means the oom killer will repeatedly select those threads and the
> > livelock still exists unless you can guarantee that they are successfully
> > thawed, that thawing them in all situations is safe, and that once thawed
> > they will make a timely exit.

Yes, this is what the first patch is fixing. Thawed tasks should die
almost immediately because they are on the way to userspace anyway.
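
To make David's point about the bonus concrete, here is a rough userspace
model of the selection, not the kernel code. It assumes a 0..1000 score
proportional to memory use, a floor of 1 for any killable task, and a
frozen_bonus parameter that is purely hypothetical (it is not the value
from the patch). struct task_model, model_badness() and model_select()
are names made up for this sketch.

```c
#include <stdbool.h>

/* Simplified model of one OOM candidate; not a kernel structure. */
struct task_model {
	unsigned long mem_pages;	/* rss + swap, in pages */
	bool frozen;
};

/*
 * Score on a 0..1000 scale, proportional to memory use.
 * frozen_bonus is a hypothetical number of points subtracted
 * from frozen tasks.
 */
static long model_badness(const struct task_model *t,
			  unsigned long total_pages, long frozen_bonus)
{
	long points = (long)(t->mem_pages * 1000 / total_pages);

	if (t->frozen)
		points -= frozen_bonus;
	/* A killable task never scores zero, so it stays selectable. */
	return points < 1 ? 1 : points;
}

/* Return the index of the highest-scoring task, or -1 for an empty set. */
static int model_select(const struct task_model *tasks, int n,
			unsigned long total_pages, long frozen_bonus)
{
	int best = -1;
	long best_points = 0;

	for (int i = 0; i < n; i++) {
		long points = model_badness(&tasks[i], total_pages,
					    frozen_bonus);

		if (points > best_points) {
			best = i;
			best_points = points;
		}
	}
	return best;
}
```

In this model a frozen task is still selected whenever every candidate is
frozen, or whenever its score stays the highest after the bonus is
subtracted. The bonus only shifts the comparison, which is why the
thawing done by the first patch is still needed.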

> >
> > Additionally, I don't think biasing against frozen tasks makes sense from
> > a heuristic standpoint of the oom killer. Why would we want to give
> > non-frozen tasks that are actually getting work done a preference over a
> > task that is frozen and doing absolutely nothing?

Because frozen tasks are usually in that state (not considering the
suspend path, which has OOM disabled) because of a user request (e.g. via
the freezer cgroup). I wouldn't be surprised if somebody relied on the D
state and on the task not getting killed.

> > It seems like that's backwards and that we'd actually prefer killing
> > the task doing nothing so it can free its memory.
> >
>
> I agree with David.
> Why don't you set oom_score_adj to -1000 for processes which should never die?

It is a little bit unintuitive to think about the OOM killer when you
just want to debug your frozen application.
On the other hand, I agree that adding a new heuristic for a use case
that is not entirely clear, and which does not give a 100% guarantee
anyway, is not good.
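
For reference, Kame's suggestion comes down to writing -1000 to
/proc/<pid>/oom_score_adj before freezing the task. A minimal userspace
sketch follows; write_oom_score_adj() and oom_exempt_pid() are
hypothetical helper names, and lowering the value normally requires
sufficient privilege.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Write an oom_score_adj value to the given file.
 * Valid values are -1000..1000; -1000 exempts the task from OOM killing.
 * Returns 0 on success, -1 on failure.
 */
static int write_oom_score_adj(const char *path, int adj)
{
	FILE *f;
	int ok;

	if (adj < -1000 || adj > 1000)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", adj) > 0;
	/* fclose() flushes the buffer, so a rejected write shows up here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Hypothetical convenience wrapper: exempt one pid from the OOM killer. */
static int oom_exempt_pid(pid_t pid)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%ld/oom_score_adj", (long)pid);
	return write_oom_score_adj(path, -1000);
}
```

Calling oom_exempt_pid(pid) on the task to be debugged, and restoring the
old value afterwards, is the extra step a user would have to remember.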

So, please scratch this patch and let's wait for somebody with a valid
use case.

> Don't you freeze processes from userland using a cgroup?

That was exactly the use case I had in mind: somebody using the freezer
cgroup to freeze a task in order to debug it.
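
A minimal sketch of that use case from userland, assuming a cgroup v1
freezer hierarchy and an already created group directory. The directory
path is whatever the admin mounted and created (it is an assumption
here), and write_line(), freezer_add_task() and freezer_set_state() are
hypothetical helper names.

```c
#include <stdio.h>

/* Write one line to a control file. Returns 0 on success, -1 on failure. */
static int write_line(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%s\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Move a task into the freezer group by writing its id to "tasks". */
static int freezer_add_task(const char *cgroup_dir, long tid)
{
	char path[512], value[32];
	int n = snprintf(path, sizeof(path), "%s/tasks", cgroup_dir);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	snprintf(value, sizeof(value), "%ld", tid);
	return write_line(path, value);
}

/* Request FROZEN (freeze != 0) or THAWED (freeze == 0) for the group. */
static int freezer_set_state(const char *cgroup_dir, int freeze)
{
	char path[512];
	int n = snprintf(path, sizeof(path), "%s/freezer.state", cgroup_dir);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return write_line(path, freeze ? "FROZEN" : "THAWED");
}
```

Freezing is asynchronous, so a caller that needs the task to be stopped
should read freezer.state back until it reports FROZEN.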

>
> Thanks,
> -Kame

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic