Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

From: Suren Baghdasaryan
Date: Thu Aug 20 2020 - 11:57:26 EST


On Thu, Aug 20, 2020 at 7:53 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
> Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> writes:
>
> > On 2020/08/20 23:00, Christian Brauner wrote:
> >> On Thu, Aug 20, 2020 at 10:48:43PM +0900, Tetsuo Handa wrote:
> >>> On 2020/08/20 22:34, Christian Brauner wrote:
> >>>> On Thu, Aug 20, 2020 at 03:26:31PM +0200, Michal Hocko wrote:
> >>>>> If you can handle vfork by other means then I am all for it. There were
> >>>>> no patches in that regard proposed yet. Maybe it will turn out simpler
> >>>>> than the heavy lifting we have to do in the oom specific code.
> >>>>
> >>>> Eric's not wrong. I fiddled with this too this morning but since
> >>>> oom_score_adj is fiddled with in a bunch of places this seemed way more
> >>>> code churn than what's proposed here.
> >>>
> >>> I prefer simply reverting commit 44a70adec910d692 ("mm, oom_adj: make sure
> >>> processes sharing mm have same view of oom_score_adj").
> >>>
> >>> https://lore.kernel.org/patchwork/patch/1037208/
> >>
> >> I guess this is a can of worms but just for the sake of getting more
> >> background: the question seems to be whether the oom adj score is a
> >> property of the task/thread-group or a property of the mm. I always
> >> thought the oom score is a property of the task/thread-group and not the
> >> mm which is also why it lives in struct signal_struct and not in struct
> >> mm_struct. But
> >>
> >> 44a70adec910 ("mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj")
> >>
> >> reads like it is supposed to be a property of the mm or at least the
> >> change makes it so.
> >
> > Yes, 44a70adec910 is trying to go towards changing from a property of the task/thread-group
> > to a property of mm. But I don't think we need to do it at the cost of "__set_oom_adj() latency
> > Yong-Taek Lee and Tim Murray have reported" and "complexity for supporting
> > vfork() => __set_oom_adj() => execve() sequence".
>
> The thing is commit 44a70adec910d692 ("mm, oom_adj: make sure processes
> sharing mm have same view of oom_score_adj") has been in the tree for 4
> years.
>
> That someone is just now noticing a regression is their problem. The
> change in semantics is done and decided. We cannot reasonably revert
> at this point without risking other regressions.
>
> Given that the decision has already been made to make oom_adj
> effectively per mm, there is no point in having a debate about
> whether we should do it.

Catching up on the discussion which was going on while I was asleep...
So it sounds like there is a consensus that oom_adj should be moved to
mm_struct rather than trying to synchronize it among tasks sharing mm.
That sounds reasonable to me too. Michal answered all the earlier
questions about this patch, so I won't be reiterating them, thanks
Michal. If any questions are still lingering about the original patch
I'll be glad to answer them.
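
To make the two options concrete, here is a rough userspace toy model of the
trade-off, not kernel code. Every name in it (toy_mm, toy_proc,
set_adj_synced, set_adj_per_mm) is hypothetical and exists only for
illustration. It contrasts keeping one oom_score_adj copy per process, which
must be propagated to every process sharing the mm, with keeping a single
copy in the mm itself.

```c
/* Toy userspace model only; NOT kernel code. All names are hypothetical. */
#include <stddef.h>

struct toy_mm {
    int oom_score_adj;      /* variant B: one copy per address space */
};

struct toy_proc {
    struct toy_mm *mm;      /* address space this process runs in */
    int oom_score_adj;      /* variant A: one copy per process */
};

/*
 * Variant A: the value lives in each process, so a write has to walk
 * every process and update the ones that share the writer's mm.
 * Returns the number of processes visited, to make the cost visible.
 */
static size_t set_adj_synced(struct toy_proc *procs, size_t n,
                             struct toy_proc *self, int adj)
{
    size_t visited = 0;

    self->oom_score_adj = adj;
    for (size_t i = 0; i < n; i++) {
        visited++;
        if (&procs[i] != self && procs[i].mm == self->mm)
            procs[i].oom_score_adj = adj;
    }
    return visited;
}

/*
 * Variant B: the value lives in the mm, so every sharer sees the new
 * value immediately and no other process has to be visited.
 */
static size_t set_adj_per_mm(struct toy_proc *self, int adj)
{
    self->mm->oom_score_adj = adj;
    return 0;
}
```

In this model the cost of variant A grows with the total number of
processes even when nothing else shares the mm, while variant B is a single
store. The sketch ignores locking, oom_score_adj_min, and the vfork() case
discussed earlier in the thread.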

>
> Eric
>
>