Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

From: Tejun Heo
Date: Sat Jan 20 2018 - 07:33:20 EST


Hello, David.

On Fri, Jan 19, 2018 at 12:53:41PM -0800, David Rientjes wrote:
> Hearing no response, I'll implement this as a separate tunable in a v2
> series assuming there are no better ideas proposed before next week. One
> of the nice things about a separate tunable is that an admin can control
> the overall policy and they can delegate the mechanism (killall vs one
> process) to a user subtree. I agree with your earlier point that killall
> vs one process is a property of the workload and is better defined
> separately.

If I understood your arguments correctly, the reasons that you thought
your selection policy changes must go together with Roman's victim
action were two-fold.

1. You didn't want a separate knob for group oom behavior and wanted
it to be combined with selection policy. I'm glad that you now
recognize that this would be the wrong design choice.

2. The current selection policy may be exploited by a delegatee and
strictly hierarchical selection should be available. We can debate
the pros and cons of different heuristics; however, to me, the
following points are clear.

* A strictly hierarchical approach can't replace the current policy.
It doesn't work well for a lot of use cases.

* OOM victim selection policy has always been subject to changes
and improvements.

I don't see any blocker here. The issue you're raising can and should
be handled separately.

In terms of interface, what makes an interface bad is when the
purposes aren't crystallized enough and different interface pieces fail
to cleanly encapsulate what's actually necessary.

Here, whether a workload can survive being killed piece-wise or not is
an inherent property of the workload and a pretty binary one at that.
I'm not necessarily against changing it to take string inputs but
don't see a rationale for doing so yet.

Thanks.

--
tejun