Re: [PATCH v3 1/2] mm: introduce process_mrelease system call

From: Shakeel Butt
Date: Mon Jul 26 2021 - 09:44:07 EST


On Mon, Jul 26, 2021 at 12:27 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
[...]
>
> Is process_mrelease on all of them really necessary? I thought that the
> primary reason for the call is to guarantee a forward progress in cases
> where the userspace OOM victim cannot die on SIGKILL. That should be
> more an exception than a normal case, no?
>

I am thinking of using this API as follows: on a user-defined OOM
condition, kill a job/cgroup and unconditionally reap all of its
processes. Keep monitoring the situation and, if it does not improve,
go for another kill and reap.

I could add logic between the kill and the reap to check whether reaping
is necessary, but reaping unconditionally is simpler.
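
A minimal sketch of the unconditional kill-then-reap step described
above, not taken from the patch. It assumes the process_mrelease(pidfd,
flags) signature from this series and the syscall numbers used by
mainline on most architectures (424, 434, 448); the helper names
kill_and_reap() and kill_and_reap_cgroup() are hypothetical, and reading
pids from a cgroup v2 cgroup.procs file is one possible way to enumerate
the job. The pidfd is opened before the kill so that the reap targets
the same process even if the pid is recycled.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback numbers for older headers (assumption: generic syscall table). */
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_mrelease
#define SYS_process_mrelease 448
#endif

/*
 * SIGKILL one process and unconditionally try to reap its memory.
 * Returns 0 if killed and reaped (or already gone), 1 if killed but
 * the reap failed (e.g. ENOSYS on a kernel without the syscall),
 * -1 if the process could not be opened or signalled.
 */
int kill_and_reap(pid_t pid)
{
    int pidfd = (int)syscall(SYS_pidfd_open, (long)pid, 0L);
    if (pidfd < 0)
        return -1;

    int ret = -1;
    if (syscall(SYS_pidfd_send_signal, (long)pidfd, (long)SIGKILL,
                NULL, 0L) == 0) {
        ret = 0;
        /* ESRCH: the victim exited on its own, nothing left to reap. */
        if (syscall(SYS_process_mrelease, (long)pidfd, 0L) < 0 &&
            errno != ESRCH)
            ret = 1;
    }

    int saved_errno = errno;
    close(pidfd);
    errno = saved_errno;
    return ret;
}

/*
 * Kill and reap every pid listed in a cgroup.procs file.
 * Returns the number of processes signalled, or -1 if the file
 * could not be opened.
 */
int kill_and_reap_cgroup(const char *procs_path)
{
    FILE *f = fopen(procs_path, "r");
    if (!f)
        return -1;

    int pid, handled = 0;
    while (fscanf(f, "%d", &pid) == 1) {
        if (kill_and_reap((pid_t)pid) >= 0)
            handled++;
    }
    fclose(f);
    return handled;
}
```

A monitor would call kill_and_reap_cgroup() on the victim's cgroup.procs
path, re-check its own OOM condition, and repeat on another job if
things have not improved.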

>
> > An alternative would be to have a cgroup specific interface for
> > reaping similar to cgroup.kill.
>
> Could you elaborate?
>

I mentioned this in [1], where I was wondering whether it makes sense to
overload cgroup.kill to also add the SIGKILLed processes to
oom_reaper_list. The downside would be that only one thread would be
doing the reaping, whereas the syscall approach allows userspace to reap
in multiple threads. For now, I would go with whatever Suren is
proposing; we can always add more if the need arises.

[1] https://lore.kernel.org/containers/CALvZod4jsb6bFzTOS4ZRAJGAzBru0oWanAhezToprjACfGm+ew@xxxxxxxxxxxxxx/