Re: OOM killer not nearly aggressive enough?

From: Michal Hocko
Date: Thu Jan 09 2020 - 16:59:04 EST


On Thu 09-01-20 13:46:04, Vito Caputo wrote:
> On Thu, Jan 09, 2020 at 10:03:07PM +0100, Pavel Machek wrote:
> > On Thu 2020-01-09 12:56:33, Michal Hocko wrote:
> > > On Tue 07-01-20 21:44:12, Pavel Machek wrote:
> > > > Hi!
> > > >
> > > > I updated my userspace to x86-64, and now chromium likes to eat all
> > > > the memory and bring the system to standstill.
> > > >
> > > > Unfortunately, OOM killer does not react:
> > > >
> > > > I'm now running "ps aux", and it prints one line every 20 seconds or
> > > > more. Do we agree that is "unusable" system? I attempted to do kill
> > > > from other session.
> > >
> > > Does sysrq+f help?
> >
> > May try that next time.
> >
> > > > Do we agree that OOM killer should have reacted way sooner?
> > >
> > > This is impossible to answer without knowing what was going on at the
> > > time. Was the system thrashing over page cache/swap? In other words, is
> > > the system completely out of memory or refaulting the working set all
> > > the time because it doesn't fit into memory?
> >
> > Swap was full, so "completely out of memory", I guess. Chromium does
> > that fairly often :-(.
> >
>
> Have you considered restricting its memory limits a la `ulimit -m`?

The kernel ignores RLIMIT_RSS, so unless the browser itself takes the
limit into consideration I do not see how that would help.
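
A minimal sketch of what "ignored" means in practice, assuming a current
Linux kernel and Python's standard resource module. The function name and
the sizes are made up for illustration. It lowers the soft RLIMIT_RSS,
touches more memory than the limit allows, and reports the peak RSS, which
should end up well above the limit:

```python
import resource


def rss_after_exceeding_limit(limit_bytes=1 << 20, alloc_bytes=32 << 20):
    """Set a small soft RLIMIT_RSS, touch a larger buffer, return (limit, peak RSS)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_RSS)
    # Never ask for a soft limit above the hard limit.
    if hard == resource.RLIM_INFINITY:
        new_soft = limit_bytes
    else:
        new_soft = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_RSS, (new_soft, hard))
    try:
        buf = bytearray(alloc_bytes)
        # Write one byte per 4 KiB so every page is actually faulted in.
        for offset in range(0, alloc_bytes, 4096):
            buf[offset] = 1
        # Assumption: ru_maxrss is reported in KiB, as it is on Linux.
        peak_bytes = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss * 1024
    finally:
        # Restore the original soft limit.
        resource.setrlimit(resource.RLIMIT_RSS, (soft, hard))
    return new_soft, peak_bytes


if __name__ == "__main__":
    limit, peak = rss_after_exceeding_limit()
    print(f"RLIMIT_RSS soft limit: {limit} bytes, peak RSS: {peak} bytes")
```

The setrlimit() call succeeds, but nothing stops the resident set from
growing past the limit, which is why `ulimit -m` does not contain a browser.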

> I've taken to running browsers in nspawn containers for general
> isolation improvements, but this also makes it easy to set cgroup
> resource limits like memcg. i.e. --property MemoryMax=2G

Yes, this should help to isolate the problem.
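
For illustration only, a sketch of the same idea without a full nspawn
container: it builds a systemd-run command line that starts a program in a
transient scope with MemoryMax set. This is a different tool from the
nspawn setup quoted above. The helper name and the "chromium" command are
hypothetical, and it assumes cgroup v2 with the memory controller
available to the (user) manager:

```python
import shlex


def memcg_limited_command(command, memory_max="2G", user_scope=True):
    """Return an argv list that runs `command` in a scope with MemoryMax applied."""
    argv = ["systemd-run"]
    if user_scope:
        # Use the per-user service manager instead of the system one.
        argv.append("--user")
    # --scope runs the command as a child of systemd-run inside a transient scope unit.
    argv += ["--scope", f"--property=MemoryMax={memory_max}", "--"]
    argv += list(command)
    return argv


if __name__ == "__main__":
    # "chromium" is only an example command name.
    print(shlex.join(memcg_limited_command(["chromium"])))
```

With such a limit the memcg OOM killer acts inside the scope once the
limit is hit, instead of the whole system running out of memory.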

> This prevents the browser from bogging down the entire system, but it
> doesn't prevent thrashing before FF OOMs within its control group.
>
> I do feel there's a problem with the kernel's reclaim algorithm, it
> seems far too willing to evict file-backed pages that are recently in
> use.

It is true that memory reclaim is quite biased toward the page cache
unless only a very small amount of page cache is left. Page cache
refaults are taken into account during reclaim, but I am afraid there
are still corner cases where a workload might end up thrashing, be it on
the page cache or on anonymous memory, depending on the workload. Anyway,
getting data from real workloads is always good so that we can think
about improving the existing heuristics.
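
One possible way to collect such data, as a rough sketch: sample
/proc/vmstat twice and compare the refault counters. The helper names are
made up, and the counter names are an assumption that depends on the
kernel version (a single workingset_refault line on some kernels, separate
_anon/_file lines on others), so the parser sums every line starting with
that prefix:

```python
def refault_count(vmstat_text):
    """Sum all workingset_refault* counters found in /proc/vmstat-style text."""
    total = 0
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each line is expected to look like "<name> <value>".
        if len(parts) == 2 and parts[0].startswith("workingset_refault"):
            total += int(parts[1])
    return total


def refault_rate(before_text, after_text, interval_s):
    """Refaults per second between two snapshots taken interval_s seconds apart."""
    return (refault_count(after_text) - refault_count(before_text)) / interval_s


if __name__ == "__main__":
    import time

    with open("/proc/vmstat") as f:
        before = f.read()
    time.sleep(10)
    with open("/proc/vmstat") as f:
        after = f.read()
    print(f"refaults/s: {refault_rate(before, after, 10):.1f}")
```

A refault rate that stays high while the system is unresponsive would
suggest the working set is being refaulted over and over rather than the
system being completely out of memory.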

--
Michal Hocko
SUSE Labs