Re: [PATCH RFC] memcg: notify about global mem_cgroup_id space depletion

From: Vasily Averin
Date: Sun Jun 26 2022 - 03:11:38 EST


On 6/26/22 04:56, Roman Gushchin wrote:
> On Sat, Jun 25, 2022 at 05:04:27PM +0300, Vasily Averin wrote:
>> Currently host owner is not informed about the exhaustion of the
>> global mem_cgroup_id space. When this happens, systemd cannot
>> start a new service, but nothing points to the real cause of
>> this failure.
>>
>> Signed-off-by: Vasily Averin <vvs@xxxxxxxxxx>
>> ---
>> mm/memcontrol.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index d4c606a06bcd..5229321636f2 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -5317,6 +5317,7 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
>> 1, MEM_CGROUP_ID_MAX + 1, GFP_KERNEL);
>> if (memcg->id.id < 0) {
>> error = memcg->id.id;
>> + pr_notice_ratelimited("mem_cgroup_id space is exhausted\n");
>> goto fail;
>> }
>
> Hm, in this case it should return -ENOSPC and it's a very unique return code.
> If it's not returned from the mkdir() call, we should fix this.
> Otherwise it's up to systemd to handle it properly.
>
> I'm not opposing for adding a warning, but parsing dmesg is not how
> the error handling should be done.

I agree, I think it's a good idea. Moreover, I think it makes sense to
also use -ENOSPC when the local cgroup's limit is reached.
Currently cgroup_mkdir() returns -EAGAIN in that case, which looks
strange to me:

	if (!cgroup_check_hierarchy_limits(parent)) {
		ret = -EAGAIN;
		goto out_unlock;
	}
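
To illustrate why the errno matters to userspace, here is a rough
sketch of how a caller of mkdir() could tell the cases apart. Both
helper names are hypothetical and only for illustration, they are not
taken from systemd or any other existing tool. The ENOSPC branch
assumes the id-space failure really reaches mkdir() as -ENOSPC, as
discussed above, and the EAGAIN branch reflects the current
cgroup_mkdir() behaviour quoted above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: map the errno of a failed cgroup mkdir() to a
 * human-readable hint. The ENOSPC mapping is an assumption based on
 * this thread, not a documented guarantee.
 */
static const char *cgroup_mkdir_hint(int err)
{
	switch (err) {
	case 0:
		return "ok";
	case ENOSPC:
		return "cgroup id space or limit exhausted";
	case EAGAIN:
		return "cgroup hierarchy limit reached";
	default:
		/* Anything else is reported with the generic text. */
		return strerror(err);
	}
}

/*
 * Hypothetical helper: try to create a cgroup directory.
 * Returns 0 on success or the errno value set by mkdir().
 */
static int try_cgroup_mkdir(const char *path)
{
	if (mkdir(path, 0755) == 0)
		return 0;
	return errno;
}
```

A caller could then log cgroup_mkdir_hint(try_cgroup_mkdir(path)) next
to the unit name instead of a generic failure. With -EAGAIN the caller
may reasonably assume the condition is transient and retry, which is
not helpful when a limit has been reached.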

Thank you,
Vasily Averin