Re: [BUGFIX][PATCH] oom-kill: fix NUMA constraint check with nodemask v4.2

From: KAMEZAWA Hiroyuki
Date: Mon Dec 14 2009 - 20:35:30 EST


On Mon, 14 Dec 2009 17:16:32 -0800
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

>
> So I have a note-to-self here that these patches:
>
> oom_kill-use-rss-value-instead-of-vm-size-for-badness.patch
> oom-kill-show-virtual-size-and-rss-information-of-the-killed-process.patch
> oom-kill-fix-numa-consraint-check-with-nodemask-v42.patch
>
> are tentative and it was unclear whether I should merge them.
>
> What do we think?
>

In my view,
oom-kill-show-virtual-size-and-rss-information-of-the-killed-process.patch
- should be merged. We tend to get several OOM reports a month, so
more precise information is always welcome.

oom-kill-fix-numa-consraint-check-with-nodemask-v42.patch
- should be merged. This is a bug fix.

oom_kill-use-rss-value-instead-of-vm-size-for-badness.patch
- should not be merged.
I'm now preparing more counters for mm statistics. It's better to
wait and see what more we can do. Other patches for overall
oom-killer improvement are also under development.

Also, there is a compatibility problem.
As David says, this may break some crazy software which uses
fake_numa+cpuset+oom_killer+oom_adj for resource control.
(though I would recommend they use memcg rather than crazy tricks...)

Two ideas I can think of now are:
1) add sysctl_oom_calc_on_committed_memory
If this is set, use vm-size instead of rss.

2) add /proc/<pid>/oom_guard_size
This allows users to specify a "valid/expected size" for a task.
For example, after
# echo 10M > /proc/<pid>/oom_guard_size
the OOM calculation subtracts 10 Mbytes from the rss size.
(The best way would be to estimate this automatically from vm_size, but...)
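
To make the two ideas concrete, here is a rough userspace sketch of how the
base score could be selected. It is only an illustration, not kernel code:
struct task_mem, oom_base_points() and the guard_pages field are hypothetical
names, sizes are counted in pages, and clamping the result at zero when the
guard size exceeds rss is my assumption.

```c
/* Hypothetical per-task memory figures, in pages (not a kernel structure). */
struct task_mem {
    unsigned long total_vm;    /* virtual size of the task */
    unsigned long rss;         /* resident set size of the task */
    unsigned long guard_pages; /* models /proc/<pid>/oom_guard_size */
};

/* Models idea 1: when nonzero, score by vm-size instead of rss. */
int sysctl_oom_calc_on_committed_memory;

/* Return the base value that a badness calculation would start from. */
unsigned long oom_base_points(const struct task_mem *t)
{
    if (sysctl_oom_calc_on_committed_memory)
        return t->total_vm;

    /* Idea 2: subtract the guard size from rss (assumed to clamp at 0). */
    if (t->rss <= t->guard_pages)
        return 0;
    return t->rss - t->guard_pages;
}
```

With 4 KiB pages, "echo 10M" would correspond to guard_pages = 2560, so a
task with rss = 3000 pages would be scored as 440 pages.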



Thanks,
-Kame
