Re: Let's talk about the elephant in the room - the Linux kernel's inability to gracefully handle low memory pressure

From: Vlastimil Babka
Date: Thu Aug 08 2019 - 10:47:23 EST


On 8/7/19 10:51 PM, Johannes Weiner wrote:
> From 9efda85451062dea4ea287a886e515efefeb1545 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@xxxxxxxxxxx>
> Date: Mon, 5 Aug 2019 13:15:16 -0400
> Subject: [PATCH] psi: trigger the OOM killer on severe thrashing

Thanks a lot, perhaps finally we are going to eat the elephant ;)

I've tested this by booting with mem=8G and activating browser tabs for
as long as I could. Initially the system started thrashing and didn't
recover for minutes. Then I realized sysrq+f was disabled... I fixed that
after the next reboot, tried lower thresholds, and also started
monitoring /proc/pressure/memory. It turned out that after minutes of
not being able to move the cursor, both avg10 and avg60 showed only
around 15 for both some and full. After I lowered thrashing_oom_level to
10 (with thrashing_oom_period of 5), the thrashing OOM finally started
kicking in, and the system recovered by itself in reasonable time.
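
For anyone who wants to watch the same numbers, the monitoring amounts to
roughly the sketch below. It only parses the documented
"some/full avg10=... avg60=... avg300=... total=..." format of
/proc/pressure/memory. The function names are made up for illustration
and are not part of the patch, and it does not reproduce the patch's
thrashing_oom_level logic.

```python
def parse_psi(text):
    """Parse the contents of a PSI file into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind = fields[0]  # "some" or "full"
        values = {}
        for item in fields[1:]:
            key, _, value = item.partition("=")
            # avg10/avg60/avg300 are percentages; total is stall time in microseconds
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


def read_memory_pressure(path="/proc/pressure/memory"):
    """Read and parse the system-wide memory pressure file."""
    with open(path) as f:
        return parse_psi(f.read())


if __name__ == "__main__":
    psi = read_memory_pressure()
    for kind in ("some", "full"):
        if kind in psi:
            print(kind, psi[kind]["avg10"], psi[kind]["avg60"])
```

Running it in a loop while opening tabs shows how the averages move
relative to whatever threshold is configured.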

So my conclusion is that the patch works, but the PSI memory values on my
system look suspiciously low. Any idea how to investigate this? Also,
does it matter that it's a modern desktop, so
systemd puts everything into cgroups, and the unified cgroup2 hierarchy
is also mounted?
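
One thing I could imagine comparing is the per-cgroup memory.pressure
files that cgroup2 exposes in the same format as /proc/pressure/memory.
Below is a rough sketch of that, not something I have verified to explain
the low values. The mount point is an assumption (the unified hierarchy
may live elsewhere, e.g. under /sys/fs/cgroup/unified), and the function
names are purely illustrative.

```python
import os


def full_avg10(text):
    """Return the avg10 value from the "full" line of a PSI file, or None."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "full":
            for item in fields[1:]:
                key, _, value = item.partition("=")
                if key == "avg10":
                    return float(value)
    return None


def rank_cgroups(root="/sys/fs/cgroup"):
    """Walk a cgroup2 mount; return (full avg10, cgroup path) pairs, highest first."""
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "memory.pressure" not in filenames:
            continue
        try:
            with open(os.path.join(dirpath, "memory.pressure")) as f:
                value = full_avg10(f.read())
        except OSError:
            continue  # cgroup went away, or the file could not be read
        if value is not None:
            results.append((value, os.path.relpath(dirpath, root)))
    results.sort(reverse=True)
    return results


if __name__ == "__main__":
    # Print the ten cgroups with the highest "full" avg10 memory pressure.
    for value, cgroup in rank_cgroups()[:10]:
        print(f"{value:6.2f}  {cgroup}")
```

If one cgroup reports much higher pressure than the system-wide file,
that would at least narrow down where the stalls are being accounted.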

Thanks,
Vlastimil