Re: [PATCH v6 1/2] mm: introduce process_mrelease system call

From: Suren Baghdasaryan
Date: Thu Aug 05 2021 - 11:30:12 EST


On Thu, Aug 5, 2021 at 12:10 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Wed 04-08-21 11:50:03, Suren Baghdasaryan wrote:
> [...]
> > +SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
> > +{
> > +#ifdef CONFIG_MMU
> > + struct mm_struct *mm = NULL;
> > + struct task_struct *task;
> > + unsigned int f_flags;
> > + struct pid *pid;
> > + long ret = 0;
> > +
> > + if (flags)
> > + return -EINVAL;
> > +
> > + pid = pidfd_get_pid(pidfd, &f_flags);
> > + if (IS_ERR(pid))
> > + return PTR_ERR(pid);
> > +
> > + task = get_pid_task(pid, PIDTYPE_PID);
> > + if (!task) {
> > + ret = -ESRCH;
> > + goto put_pid;
> > + }
> > +
> > + /*
> > + * If the task is dying and in the process of releasing its memory
> > + * then get its mm.
> > + */
> > + task = find_lock_task_mm(task);
>
> You want a different task_struct because the returned one might be
> different from the given one and you already hold a reference which you
> do not want to leak

Ah, right. I was looking at the task locking, which find_lock_task_mm()
handles, but I missed the task pinning part. Will fix.
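
To make the pinning issue concrete, here is a rough userspace model of
the pattern, not kernel code. All the model_* names and the struct are
stand-ins made up for illustration. The only assumption carried over
from the kernel is that get_pid_task() takes a reference while
find_lock_task_mm() only locks the task it returns, which may be a
different thread.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; none of these are the real kernel types or helpers. */
struct model_task {
	int usage;                  /* models the task_struct reference count */
	int locked;                 /* models task_lock() being held */
	int has_mm;                 /* non-zero if this thread still has an mm */
	struct model_task *sibling; /* next thread in the group, or NULL */
};

/* Models get_pid_task(): takes a reference on the task. */
static struct model_task *model_get_task(struct model_task *t)
{
	t->usage++;
	return t;
}

/* Models put_task_struct(): drops one reference. */
static void model_put_task(struct model_task *t)
{
	t->usage--;
}

/*
 * Models find_lock_task_mm(): returns a locked thread that still has an
 * mm, possibly a different one from the argument. Takes no reference.
 */
static struct model_task *model_find_lock_task_mm(struct model_task *t)
{
	for (struct model_task *p = t; p; p = p->sibling) {
		if (p->has_mm) {
			p->locked = 1;
			return p;
		}
	}
	return NULL;
}

/* Models task_unlock(). */
static void model_task_unlock(struct model_task *p)
{
	p->locked = 0;
}

/* Returns 0 on success, -1 if no thread with an mm was found. */
static int model_mrelease(struct model_task *target)
{
	struct model_task *task = model_get_task(target); /* pinned */
	struct model_task *p;                             /* locked, not pinned */
	int ret = 0;

	p = model_find_lock_task_mm(task);
	if (!p) {
		ret = -1;
		goto put_task;
	}
	/* ... inspect the mm of p under the lock here ... */
	model_task_unlock(p);

put_task:
	model_put_task(task); /* always drop the ref on the pinned task */
	return ret;
}
```

Keeping "task" and "p" separate means the put at the end always pairs
with the get at the top, whichever thread ends up holding the mm.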

>
> > + if (!task) {
> > + ret = -ESRCH;
> > + goto put_pid;
> > + }
> > + if (task_will_free_mem(task) && (task->flags & PF_KTHREAD) == 0) {
> > + mm = task->mm;
> > + mmget(mm);
> > + }
> > + task_unlock(task);
> > + if (!mm) {
> > + ret = -EINVAL;
> > + goto put_task;
> > + }
> > +
> > + if (test_bit(MMF_OOM_SKIP, &mm->flags))
> > + goto put_mm;
>
> This is too late to check for MMF_OOM_SKIP. task_will_free_mem will fail
> with the flag being set. I believe you want something like the
> following:
>
> p = find_lock_task_mm(task);
> mm = p->mm;
>
> /* The work has been done already */
> if (test_bit(MMF_OOM_SKIP, &mm->flags)) {
> task_unlock(p);
> goto put_task;
> }
>
>
> if (!task_will_free_mem(p)) {
> task_unlock(p);
> goto put_task;
> }
>
> mmget(mm);
> task_unlock(p);
>

I see. I'll update the patch and ask Andrew to remove the previous
version from the mm tree.
Thanks for reviewing and pointing out the issues!
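
For the record, the ordering problem can be shown with a small
userspace model. This is only a sketch. struct model_mm, the model_*
helpers and the "1 means go on to reap" return value are all made up
for illustration. It assumes, per the comment above, that
task_will_free_mem() fails once MMF_OOM_SKIP is set, and that the
"not dying" case keeps returning -EINVAL as in the posted patch (the
suggested snippet does not say which value to use there).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for the two bits of state that matter here. */
struct model_mm {
	bool oom_skip; /* models MMF_OOM_SKIP being set */
	bool dying;    /* models the owner being about to free its memory */
};

/* Models task_will_free_mem(): fails once the skip flag is set. */
static bool model_will_free_mem(const struct model_mm *mm)
{
	if (mm->oom_skip)
		return false;
	return mm->dying;
}

/* Check order in the posted patch. */
static int model_check_posted(const struct model_mm *mm)
{
	if (!model_will_free_mem(mm))
		return -EINVAL;
	if (mm->oom_skip) /* unreachable: already rejected above */
		return 0;
	return 1; /* hypothetical: go on to reap */
}

/* Check order with the skip flag tested first. */
static int model_check_reordered(const struct model_mm *mm)
{
	if (mm->oom_skip)
		return 0; /* the work has been done already */
	if (!model_will_free_mem(mm))
		return -EINVAL;
	return 1; /* hypothetical: go on to reap */
}
```

With an already reaped mm (oom_skip set), the posted order reports
-EINVAL while the reordered one reports success. The two agree in the
other cases.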

>
> > +
> > + if (mmap_read_lock_killable(mm)) {
> > + ret = -EINTR;
> > + goto put_mm;
> > + }
> > + if (!__oom_reap_task_mm(mm))
> > + ret = -EAGAIN;
> > + mmap_read_unlock(mm);
> > +
> > +put_mm:
> > + mmput(mm);
> > +put_task:
> > + put_task_struct(task);
> > +put_pid:
> > + put_pid(pid);
> > + return ret;
> > +#else
> > + return -ENOSYS;
> > +#endif /* CONFIG_MMU */
> > +}
> > --
> > 2.32.0.554.ge1b32706d8-goog
>
> --
> Michal Hocko
> SUSE Labs