Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

From: Michal Hocko
Date: Fri Aug 21 2020 - 03:21:54 EST


On Thu 20-08-20 23:39:25, Eric W. Biederman wrote:
> Michal Hocko <mhocko@xxxxxxxx> writes:
>
> > On Thu 20-08-20 08:56:53, Suren Baghdasaryan wrote:
> > [...]
> >> Catching up on the discussion which was going on while I was asleep...
> >> So it sounds like there is a consensus that oom_adj should be moved to
> >> mm_struct rather than trying to synchronize it among tasks sharing mm.
> >> That sounds reasonable to me too. Michal answered all the earlier
> >> questions about this patch, so I won't be reiterating them, thanks
> >> Michal. If any questions are still lingering about the original patch
> >> I'll be glad to answer them.
> >
> > I think it still makes some sense to go with a simpler (aka less tricky)
> > solution which would be your original patch with an incremental fix for
> > vfork and the proper ordering (http://lkml.kernel.org/r/20200820124109.GI5033@xxxxxxxxxxxxxx)
> > and then make a more complex shift to mm struct on top of that. The
> > former will be less tricky to backport to stable IMHO.
>
> So I am confused.
>
> I don't know how a subtle dependency on something in clone
> is better than something flat footed in exec.

Well, I think that setting a flag is an easier approach than handle all
the special cases for the mm thing. But this is likely because this is
not my domain so my judgment call might be misled. Anyway if there is a
general consensus that doing the middle step is not worth it I am not
going to object.

> That said if we are going for a small change why not:
>
> /*
> * Make sure we will check other processes sharing the mm if this is
> * not vfrok which wants its own oom_score_adj.
> * pin the mm so it doesn't go away and get reused after task_unlock
> */
> if (!task->vfork_done) {
> struct task_struct *p = find_lock_task_mm(task);
>
> if (p) {
> - if (atomic_read(&p->mm->mm_users) > 1) {
> + if (atomic_read(&p->mm->mm_users) > p->signal->nr_threads) {
> mm = p->mm;
> mmgrab(mm);
> }
> task_unlock(p);
> }
> }

I remember playing with something like that but it had problems too. I
do not remember details. Oleg would know better.

--
Michal Hocko
SUSE Labs