Re: [RESEND v12 0/6] cgroup-aware OOM killer

From: Michal Hocko
Date: Thu Oct 19 2017 - 17:09:53 EST


On Thu 19-10-17 15:45:34, Johannes Weiner wrote:
> On Thu, Oct 19, 2017 at 07:52:12PM +0100, Roman Gushchin wrote:
> > This patchset makes the OOM killer cgroup-aware.
>
> Hi Andrew,
>
> I believe this code is ready for merging upstream, and it seems Michal
> is in agreement. There are two main things to consider, however.
>
> David would have really liked for this patchset to include knobs to
> influence how the algorithm picks cgroup victims. The rest of us
> agreed that this is beyond the scope of these patches, that the
> patches don't need it to be useful, and that there is nothing
> preventing anyone from adding configurability later on. David
> subsequently nacked the series as he considers it incomplete. Neither
> Michal nor I see technical merit in David's nack.

agreed

> Michal acked the implementation, but on the condition that the new
> behavior be opt-in, to not surprise existing users.

and just to make it clear I have also said I will _not_ nack if that is
not the case.

> I *think* we agree
> that respecting the cgroup topography during global OOM is what we
> should have been doing when cgroups were initially introduced;

We do not agree here though. I am not convinced that respecting the
cgroup topography is a universal win. It is true that there is no best
OOM victim selection strategy but what we have currently is the simplest
option and as such the most robust one. I can tell from past years'
experience that many of those clever heuristics actually contributed to
lockups and non-deterministic behavior.

> where
> we disagree is that I think users shouldn't have to opt in to
> improvements. We have done much more invasive changes to the victim
> selection without actual regressions in the past. Further, this change
> only applies to mounts of the new cgroup2.

which basically means that the behavior will change under many users'
feet because the respective cgroup configuration is chosen by somebody
else (e.g. systemd), so I do not really buy "only v2 behavior".

> Tejun also wasn't convinced
> of the risk for regression, and too would prefer cgroup-awareness to
> be the default in cgroup2. I would ask for patch 5/6 to be dropped.

--
Michal Hocko
SUSE Labs