problem with accounting of allocations called from __net_init hooks

From: Vasily Averin
Date: Mon Apr 11 2022 - 05:41:45 EST


On 3/1/22 21:09, Shakeel Butt wrote:
> On Mon, Feb 28, 2022 at 06:36:58AM -0800, Luis Chamberlain wrote:
>> On Mon, Feb 28, 2022 at 10:17:16AM +0300, Vasily Averin wrote:
>> > Following one-liner running inside memcg-limited container consumes
>> > huge number of host memory and can trigger global OOM.
>> >
>> > for i in `seq 1 xxx` ; do ip l a v$i type veth peer name vp$i ; done
>> >
>> > Patch accounts most part of these allocations and can protect host.
>> > ---[cut]---
>> > It is not polished, and perhaps should be splitted.
>> > obviously it affects other kind of netdevices too.
>> > Unfortunately I'm not sure that I will have enough time to handle it properly
>> > and decided to publish current patch version as is.
>> > OpenVz workaround it by using per-container limit for number of
>> > available netdevices, but upstream does not have any kind of
>> > per-container configuration.
>> > ------

I've noticed that __register_pernet_operations() executes init hook of registered
pernet_operation structure in all found net namespaces.

Usually these hooks are called by process related to specified net namespace,
and all marked allocation are accounted to related container:
i.e. objects related to netns in container A are accounted to memcg of container A,
objects allocated inside container B are accounted to corresponding memcg B,
and so on.

However __register_pernet_operations() calls the same hooks in one context,
and as result all marked allocations are accounted to one memcg.
It is quite rare scenario, however current processing looks incorrect for me.

I expect we can take memcg from 'struct net', because of this structure is accounted per se.
then we can use set_active_memcg() before init hook execution.
However I'm not sure it is fully correct.

Could you please advise some better solution?

Thank you,
Vasily Averin