Re: [PATCH] oom_kill: use rss value instead of vm size for badness

From: David Rientjes
Date: Thu Dec 03 2009 - 18:25:31 EST


On Wed, 2 Dec 2009, KOSAKI Motohiro wrote:

> - I mean you need almost no kernel heuristic, but desktop users need one.

My point is that userspace needs to be able to identify memory leaking
tasks and polarize oom killing priorities. /proc/pid/oom_adj does a good
job of both with total_vm as a baseline.
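
To illustrate the polarizing effect, here is a minimal sketch of how an
oom_adj value shifts a baseline score. It models only the bit-shift
adjustment from the 2.6-era badness() function and assumes total_vm (in
pages) as the baseline. The other heuristics (children, CPU time, nice
level, capabilities) are left out, and the name `adjusted_score` is
hypothetical.

```python
OOM_DISABLE = -17  # oom_adj value that exempts a task from oom killing


def adjusted_score(total_vm_pages: int, oom_adj: int) -> int:
    """Apply an oom_adj bit shift to a baseline score (simplified model)."""
    if oom_adj == OOM_DISABLE:
        return 0  # the task is never selected
    points = total_vm_pages
    if oom_adj > 0:
        # A zero baseline is bumped to 1 so the shift still has an effect.
        points = max(points, 1) << oom_adj
    elif oom_adj < 0:
        points >>= -oom_adj
    return points
```

Because each step of oom_adj doubles or halves the score, a value set once
against a stable baseline keeps its meaning for the lifetime of the task.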

> - All job schedulers provide a memory limitation feature, but the OOM killer
> isn't meant to implement memory limitation. We have the memory cgroup.

Wrong, the oom killer implements cpuset memory limitations.

> - If you need memory-usage-based control, reading /proc/{pid}/statm and
> writing /proc/{pid}/oom_priority probably works well.

Polling /proc/pid/stat and updating the oom killer priorities at a fixed
interval is a ridiculous proposal for identifying memory leakers, sorry.
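
For reference, the quoted suggestion would amount to something like the
following: sample the resident page count from /proc/<pid>/statm at some
interval and write a new priority each time. This is only a sketch of the
sampling half. `parse_statm` and `read_rss_pages` are hypothetical helper
names, and the write is omitted because /proc/<pid>/oom_priority is taken
from the quoted text and is not assumed to exist.

```python
def parse_statm(line: str) -> dict:
    """Parse one line of /proc/<pid>/statm; all values are in pages."""
    fields = ("size", "resident", "shared", "text", "lib", "data", "dt")
    values = [int(v) for v in line.split()]
    return dict(zip(fields, values))


def read_rss_pages(pid: int) -> int:
    """Return the resident set size of pid in pages (Linux only)."""
    with open(f"/proc/{pid}/statm") as f:
        return parse_statm(f.read())["resident"]
```

Any value sampled this way is only as fresh as the last poll.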

> - Unfortunately, we can't continue to use VSZ-based heuristics, because
> modern applications waste 10x more VSZ than their RSS consumption. Nowadays,
> VSZ isn't a good approximation of RSS. There isn't any good reason to
> continue from the desktop user's point of view.
>

Then leave the heuristic alone by default so we don't lose any
functionality that we once had and then add additional heuristics
depending on the environment as determined by the manipulation of a new
tunable.

> IOW, the kernel heuristic should adjust to target the majority of users; we
> provide a knob to help the minority.
>

Moving the baseline to rss severely impacts the legitimacy of that knob:
we lose a lot of control over identifying memory leakers and polarizing
oom killer priorities because the score then depends on the state of the
VM at the time of oom, which /proc/pid/oom_adj may not have been updated
recently enough to reflect.
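
A small sketch of the concern, using made-up numbers: two hypothetical
tasks, one oom_adj setting, and three possible baselines (total_vm, rss
sampled early, rss sampled late). The task names, page counts, and the
`pick_victim` helper are all illustrative assumptions, and only the
oom_adj bit shift is modeled.

```python
def pick_victim(tasks, baseline):
    """Return the name of the task with the highest shifted score."""
    def score(task):
        points = task[baseline]
        adj = task["oom_adj"]
        return points << adj if adj >= 0 else points >> -adj
    return max(tasks, key=score)["name"]


# Hypothetical page counts; oom_adj was set once by userspace.
tasks = [
    {"name": "leaker", "total_vm": 4000,
     "rss_early": 500, "rss_late": 3500, "oom_adj": 0},
    {"name": "database", "total_vm": 6000,
     "rss_early": 3000, "rss_late": 3000, "oom_adj": -1},
]
```

With these numbers, the total_vm baseline selects "leaker", while the rss
baseline selects "database" if the oom happens early and "leaker" only once
the leak has become resident, even though oom_adj never changed.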

I don't know why you continuously invoke the same arguments to completely
change the baseline for the oom killer heuristic because you falsely
believe that the task with the largest memory resident in RAM is more
often than not the ideal task to kill. It's very frustrating when you
insist on changing the default heuristic based on your own belief that
people use Linux in the same way you do.

If Andrew pushes the patch to change the baseline to rss
(oom_kill-use-rss-instead-of-vm-size-for-badness.patch) to Linus, I'll
strongly nack it because with it you totally lack the ability to identify
memory leakers as defined by userspace, which should be the prime target
for the oom killer. You have not addressed that problem, you've merely
talked around it, and yet the patch unbelievably still sits in -mm.