Re: [PATCH] fs, mm: account filp and names caches to kmemcg

From: Michal Hocko
Date: Tue Oct 24 2017 - 08:19:12 EST


Does this sound something that you would be interested in? I can spend
som more time on it if it is worthwhile.

On Fri 13-10-17 17:24:21, Michal Hocko wrote:
> Well, it actually occured to me that this would trigger the global oom
> killer in case no memcg specific victim can be found which is definitely
> not something we would like to do. This should work better. I am not
> sure we can trigger this corner case but we should cover it and it
> actually doesn't make the code much worse.
> ---
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d5f3a62887cf..7b370f070b82 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1528,26 +1528,40 @@ static void memcg_oom_recover(struct mem_cgroup *memcg)
>
> static void mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
> {
> - if (!current->memcg_may_oom)
> - return;
> /*
> * We are in the middle of the charge context here, so we
> * don't want to block when potentially sitting on a callstack
> * that holds all kinds of filesystem and mm locks.
> *
> - * Also, the caller may handle a failed allocation gracefully
> - * (like optional page cache readahead) and so an OOM killer
> - * invocation might not even be necessary.
> + * cgroup v1 allowes sync users space handling so we cannot afford
> + * to get stuck here for that configuration. That's why we don't do
> + * anything here except remember the OOM context and then deal with
> + * it at the end of the page fault when the stack is unwound, the
> + * locks are released, and when we know whether the fault was overall
> + * successful.
> + *
> + * On the other hand, in-kernel OOM killer allows for an async victim
> + * memory reclaim (oom_reaper) and that means that we are not solely
> + * relying on the oom victim to make a forward progress so we can stay
> + * in the the try_charge context and keep retrying as long as there
> + * are oom victims to select.
> *
> - * That's why we don't do anything here except remember the
> - * OOM context and then deal with it at the end of the page
> - * fault when the stack is unwound, the locks are released,
> - * and when we know whether the fault was overall successful.
> + * Please note that mem_cgroup_oom_synchronize might fail to find a
> + * victim and then we have rely on mem_cgroup_oom_synchronize otherwise
> + * we would fall back to the global oom killer in pagefault_out_of_memory
> */
> + if (!memcg->oom_kill_disable &&
> + mem_cgroup_out_of_memory(memcg, mask, order))
> + return true;
> +
> + if (!current->memcg_may_oom)
> + return false;
> css_get(&memcg->css);
> current->memcg_in_oom = memcg;
> current->memcg_oom_gfp_mask = mask;
> current->memcg_oom_order = order;
> +
> + return false;
> }
>
> /**
> @@ -2007,8 +2021,11 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>
> mem_cgroup_event(mem_over_limit, MEMCG_OOM);
>
> - mem_cgroup_oom(mem_over_limit, gfp_mask,
> - get_order(nr_pages * PAGE_SIZE));
> + if (mem_cgroup_oom(mem_over_limit, gfp_mask,
> + get_order(nr_pages * PAGE_SIZE))) {
> + nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
> + goto retry;
> + }
> nomem:
> if (!(gfp_mask & __GFP_NOFAIL))
> return -ENOMEM;
> --
> Michal Hocko
> SUSE Labs

--
Michal Hocko
SUSE Labs