Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks

From: Michal Hocko
Date: Mon Sep 13 2021 - 04:53:41 EST


On Fri 10-09-21 15:39:28, Vasily Averin wrote:
> The kernel currently allows dying tasks to exceed the memcg limits.
> The allocation is expected to be the last one and the occupied memory
> will be freed soon.
> This is not always true because it can be part of the huge vmalloc
> allocation. Allowed once, they will repeat over and over again.
> Moreover lifetime of the allocated object can differ from
> In addition the lifetime of the dying task.
> Multiple such allocations running concurrently can not only overuse
> the memcg limit, but can lead to a global out of memory and,
> in the worst case, cause the host to panic.
>
> Signed-off-by: Vasily Averin <vvs@xxxxxxxxxxxxx>
> ---
> mm/memcontrol.c | 23 +++++------------------
> 1 file changed, 5 insertions(+), 18 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 389b5766e74f..67195fcfbddf 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1834,6 +1834,9 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
> return OOM_ASYNC;
> }
>
> + if (should_force_charge())
> + return OOM_SKIPPED;

mem_cgroup_out_of_memory already check for the bypass, now you are
duplicating that check with a different answer to the caller. This is
really messy. One of the two has to go away.

--
Michal Hocko
SUSE Labs