Re: [merged] mm-memcg-handle-non-error-oom-situations-more-gracefully.patch removed from -mm tree

From: David Rientjes
Date: Mon Dec 02 2013 - 17:51:48 EST


On Mon, 2 Dec 2013, Michal Hocko wrote:

> I guess we need to know how much is significantly less.
> oom_scan_process_thread already aborts on exiting tasks so we do not
> kill anything and then the charge (whole page fault actually) is retried
> when we check for the OOM again so my intuition would say that we gave
> the exiting task quite a lot of time.
>

That isn't the race, though. The race occurs when the oom killed process
exits prior to the process iteration, so it's not detected, and yet its
memory has already been freed and the memcg is no longer oom. In other
words, a process calls mem_cgroup_oom_synchronize() at the same time
that an oom killed process frees its memory. The result is an
unnecessary oom kill and erroneous spam in the kernel log.

We all agree that this race cannot be completely closed (at least without
synchronization in the uncharge path that we obviously don't want to add).
We don't know if an oom killed process, or any process, will free its
memory immediately after the kernel sends the SIGKILL. However, there's
absolutely no reason to not have a final check immediately before sending
the SIGKILL to prevent that unnecessary oom kill.
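
To illustrate the idea only, here is a rough userspace sketch of such a
final check. It is not the patch and not kernel code: struct toy_memcg,
toy_memcg_under_oom(), toy_uncharge() and toy_oom_kill() are hypothetical
names made up for this example, and the "kill" is just a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup; not a kernel type. */
struct toy_memcg {
	size_t usage;	/* pages currently charged */
	size_t limit;	/* hard limit in pages */
	int kills;	/* number of simulated SIGKILLs sent */
};

/* The group is oom if charging nr_pages more would exceed the limit. */
bool toy_memcg_under_oom(const struct toy_memcg *cg, size_t nr_pages)
{
	return cg->usage + nr_pages > cg->limit;
}

/* Simulates an earlier oom victim exiting and freeing its memory. */
void toy_uncharge(struct toy_memcg *cg, size_t nr_pages)
{
	cg->usage = nr_pages > cg->usage ? 0 : cg->usage - nr_pages;
}

/*
 * Simulated oom kill path. Returns true if a victim was "killed".
 * Victim selection is omitted; the point is the recheck right before
 * the kill, which narrows (but cannot close) the race window.
 */
bool toy_oom_kill(struct toy_memcg *cg, size_t nr_pages)
{
	assert(cg != NULL);

	/* ... the process iteration / victim selection would run here ... */

	/* Final check: memory may have been freed in the meantime. */
	if (!toy_memcg_under_oom(cg, nr_pages))
		return false;	/* no longer oom, skip the kill */

	cg->kills++;		/* stands in for sending SIGKILL */
	return true;
}
```

With usage at the limit, toy_oom_kill() records a kill. If toy_uncharge()
runs first, as when a previous victim has already exited, the same call
returns false and nothing is killed.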

I'm going to send the patch for review.