Re: [PATCH v2] mm/oom_kill: count global and memory cgroup oom kills

From: Michal Hocko
Date: Thu Jun 08 2017 - 05:44:50 EST


On Mon 05-06-17 17:27:50, Konstantin Khlebnikov wrote:
>
>
> On 05.06.2017 11:50, Michal Hocko wrote:
> >On Thu 25-05-17 13:28:30, Konstantin Khlebnikov wrote:
[...]
> >>index 04c9143a8625..dd30a045ef5b 100644
> >>--- a/mm/oom_kill.c
> >>+++ b/mm/oom_kill.c
> >>@@ -876,6 +876,11 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
> >> /* Get a reference to safely compare mm after task_unlock(victim) */
> >> mm = victim->mm;
> >> mmgrab(mm);
> >>+
> >>+ /* Raise event before sending signal: reaper must see this */
> >>+ count_vm_event(OOM_KILL);
> >>+ mem_cgroup_count_vm_event(mm, OOM_KILL);
> >>+
> >> /*
> >> * We should send SIGKILL before setting TIF_MEMDIE in order to prevent
> >> * the OOM victim from depleting the memory reserves from the user
> >
> >Why don't you count tasks which share mm with the oom victim?
>
> Yes, this makes sense. But these kills are not logged, thus the counter
> will differ from the logged events.

Yes, they are not, but does that matter? Do we want _all_ or only some oom
kills to be counted?

> Also these tasks might live in different cgroups, so counting to mm
> owner isn't correct.

Well, the situation with mm shared between different memcgs is always
hairy. We try to charge mm->owner but I suspect we are not consistent in
that. I would have to double check because it's been a long time since
I investigated that. My point is that once you count OOM kills you
should count all the tasks IMHO.
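
To make the "count all the tasks" idea concrete, here is a minimal
userspace sketch, not kernel code. Every name in it (struct task,
count_oom_kill, oom_kill_all_sharing_mm, NR_CGROUPS) is hypothetical and
only loosely mirrors the kernel. It also assumes each kill is accounted
to the killed task's own cgroup rather than to mm->owner, which is one
possible answer to the shared-mm concern and not what the patch does.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CGROUPS 4 /* hypothetical fixed number of cgroups */

struct mm { int id; };                      /* stand-in for mm_struct */
struct task { struct mm *mm; int cgroup; }; /* stand-in for task_struct */

static unsigned long global_oom_kills;
static unsigned long cgroup_oom_kills[NR_CGROUPS];

/* Bump the global counter and the counter of the task's own cgroup. */
static void count_oom_kill(const struct task *t)
{
	assert(t->cgroup >= 0 && t->cgroup < NR_CGROUPS);
	global_oom_kills++;
	cgroup_oom_kills[t->cgroup]++;
}

/*
 * Account a kill for every task that shares the victim's mm. The victim
 * is assumed to be part of the tasks array, so it is counted as well.
 * Returns the number of tasks counted.
 */
static size_t oom_kill_all_sharing_mm(const struct task *tasks, size_t n,
				      const struct task *victim)
{
	size_t killed = 0;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].mm != victim->mm)
			continue;
		count_oom_kill(&tasks[i]);
		killed++;
	}
	return killed;
}
```

With this accounting the counters track the number of tasks actually
killed, so they can be larger than the number of logged OOM reports.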

--
Michal Hocko
SUSE Labs