Re: [PATCH RFC] net: memcg accounting for veth devices

From: Shakeel Butt
Date: Tue Mar 01 2022 - 13:09:29 EST


On Mon, Feb 28, 2022 at 06:36:58AM -0800, Luis Chamberlain wrote:
> On Mon, Feb 28, 2022 at 10:17:16AM +0300, Vasily Averin wrote:
> > Following one-liner running inside memcg-limited container consumes
> > huge number of host memory and can trigger global OOM.
> >
> > for i in `seq 1 xxx` ; do ip l a v$i type veth peer name vp$i ; done
> >
> > Patch accounts most part of these allocations and can protect host.
> > ---[cut]---
> > It is not polished, and perhaps should be splitted.
> > obviously it affects other kind of netdevices too.
> > Unfortunately I'm not sure that I will have enough time to handle it properly
> > and decided to publish current patch version as is.
> > OpenVz workaround it by using per-container limit for number of
> > available netdevices, but upstream does not have any kind of
> > per-container configuration.
> > ------
>
> Should this just be a new ucount limit on kernel/ucount.c and have veth
> use something like inc_ucount(current_user_ns(), current_euid(), UCOUNT_VETH)?
>
> This might be abusing ucounts though, not sure, Eric?


For admins of systems running multiple workloads, there is no easy way
to set such limits for each workload. Some workloads may genuinely need
more veth devices than others. From an admin's perspective it is
preferable to have as few knobs as possible, and if these objects are
charged to memcg then the memcg limits would bound them. There was a
similar situation for inotify instances: the fs.inotify.max_user_instances
sysctl already limits them, but we charged them to memcg anyway so that
admins do not have to worry about setting such limits. See
ac7b79fd190b ("inotify, memcg: account inotify instances to kmemcg").
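
To illustrate the argument, here is a userspace toy model. It is not
kernel code: the struct, the function names and the byte sizes are all
made up for illustration. It only shows that once every allocation is
charged against a single byte budget, the number of devices a workload
can create is bounded by that budget, with no separate per-type count
to configure.

```c
/* Userspace toy model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a memcg: one byte budget shared by all object types. */
struct toy_memcg {
	size_t limit;	/* plays the role of the memcg memory limit */
	size_t usage;	/* bytes currently charged, always <= limit */
};

/* Charge nbytes to the group; fail if the limit would be exceeded. */
static bool toy_charge(struct toy_memcg *cg, size_t nbytes)
{
	assert(cg != NULL);
	if (nbytes > cg->limit - cg->usage)
		return false;
	cg->usage += nbytes;
	return true;
}

/*
 * Keep creating devices of an assumed size until charging fails,
 * and return how many were created. No per-device-type counter is
 * consulted; the byte budget alone stops the loop.
 */
static unsigned int toy_create_devices(struct toy_memcg *cg, size_t dev_bytes)
{
	unsigned int created = 0;

	if (dev_bytes == 0)
		return 0;	/* nothing to charge, avoid looping forever */
	while (toy_charge(cg, dev_bytes))
		created++;
	return created;
}
```

A workload with a larger budget simply gets more devices out of the
same loop, which is the "some may genuinely need more" case, and the
admin only ever sets the one limit.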