Re: Linux killed Kenny, bastard!

From: Evgeniy Polyakov
Date: Tue Jan 13 2009 - 19:53:58 EST


On Tue, Jan 13, 2009 at 04:32:41PM -0800, David Rientjes (rientjes@xxxxxxxxxx) wrote:
> > Out of curiousity, how feature can be used, if no one except hardcore
> > kernel hackers know how to work with it? I do not insult, no, I'm really
> > curious. This may explain, why admins I worked with about this issue did
> > not fully succeeded with tuning.
>
> You read the code.

Not the best solution actually and at least not the simplest :)

> You're treating each oom constraint like they are on the same; in a
> cpuset-constrained oom, which can be much more common than system-wide
> unconstrained ooms, we want to target a task that will allow for future
> memory freeing in that cpuset.

I do not break the way oom problem is addressed currently. I just
extend it from the different angle, which can not be resolved in some
cases. Those who do not need the feature, can safely disable the check
and rely on the old algorithms.

What you are talking here is a different problem. Completely different
case. We may have the problem case you described, and we will think on
how to resolve the issue, but approach used in the patch does not
enforce the new policy, it extends it adding new global tunable.

> So in these cases, to avoid needlessly killing your victim, you would be
> forced to set oom_victim_name to NULL. That's hardly useful if the same
> problem you're trying to fix still exists both globally and within a
> cpuset. Your patch doesn't address this use case, so it's already
> incomplete.

Incorrect point of view. If administrator wants to select victim task by
name, this patch allows this. If he does not want this, it is turned
off. There are perfectly split areas where each approach applies: global
name-based selection and more specific to the areas where problem
arises.

> In a mempolicy-constrained oom as the result of MPOL_BIND, which can also
> be much more common than system-wide unconstrained ooms, we want to target
> current because it has allocations from the bound nodes. Your patch
> doesn't touch this path, so it's already inconsistent.

And again wrong conclusion: patch is intended to work in the area it was
created for. It is the simplest (and the only btw) solution for the showed
problem. In the systems where it is not needed, it will not be used and
old algorithms will work fine, apparently since no one proposed it
before, other areas work ok without it. While the problem (quite common
actually) I showed was not addressed at all and all proposed solutions
just failed if we start checking requrements more precisely.

> I'm comfortable that this patch will not be merged, so I'll silently point
> to past posts for the duration of this thread. I definitely think the
> documentation can be improved and I don't think you'll have any opposition
> to sane heuristic changes that also rely on userspace input via
> /proc/pid/oom_adj. Thank you for working on this!

So you agree that no existing solution can solve the oom problem when
parent task has to stay and children have to be differentiated. And
agree that not merging the solution for this commonly happened problem
is the right way. And while we can reread the whole thread multiple
times, we will find again and again that proposed approaches do not
work. This patch does its simple task without breaking others and in the
systems where this feature is not needed, it can be safely turned off,
while still fixing the problem for those who care.

I will update documentation tomorrow if there will be no patches.

--
Evgeniy Polyakov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/