Re: Mount structures are leaked

From: Andrei Vagin
Date: Thu Jun 08 2017 - 18:48:24 EST


On Thu, Jun 8, 2017 at 2:37 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Jun 08, 2017 at 01:49:38PM -0700, Andrei Vagin wrote:
>> Hello,
>>
>> We found that mount structures are leaked on the upstream linux kernel:
>>
>> [root@zdtm criu]# cat /proc/slabinfo | grep mnt
>> mnt_cache 36456 36456 384 42 4 : tunables 0 0 0 : slabdata 868 868 0
>> [root@zdtm criu]# python test/zdtm.py run -t zdtm/static/env00 --iter 10 -f ns
>> === Run 1/1 ================ zdtm/static/env00
>>
>> ========================= Run zdtm/static/env00 in ns ==========================
>> Start test
>> ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
>> Run criu dump
>> Run criu restore
>> Run criu dump
>> ....
>> Run criu restore
>> Send the 15 signal to 339
>> Wait for zdtm/static/env00(339) to die for 0.100000
>> Removing dump/zdtm/static/env00/31
>> ========================= Test zdtm/static/env00 PASS ==========================
>> [root@zdtm criu]# cat /proc/slabinfo | grep mnt
>> mnt_cache 36834 36834 384 42 4 : tunables 0 0 0 : slabdata 877 877 0
>>
>> [root@zdtm linux]# git describe HEAD
>> v4.12-rc4-122-gb29794e
>>
>> [root@zdtm ~]# uname -a
>> Linux zdtm.openvz.org 4.12.0-rc4+ #2 SMP Thu Jun 8 20:49:01 CEST 2017
>> x86_64 x86_64 x86_64 GNU/Linux
>
> For fsck sake... Andrei, you *do* know better.
> 1) I have no idea what setup you have - e.g. whether you have mount event
> propagation set up in a way that ends up with mounts accumulating somewhere.
> 2) I have no idea what those scripts are, and the names don't look descriptive
> enough to google for in the hope of finding out (nor the version of those scripts,
> if there has been more than one)
> 3) I have no idea which config you have.
> 4) I have no idea which kernel that is, other than "rc4 with something
> on top of it"
> 5) I have no idea how that had behaved on other kernels (or how that was
> supposed to behave in the first place)
>
> So it boils down to "we've done something, it has given a result we didn't expect,
> the kernel must've been broken". About the only thing I can suggest at that point is
> telnet bofh.jeffballard.us 666
> and see if it provides an inspiration...

Hi All,

I'm sorry for this stripped-down report. I am continuing to investigate this
issue and will provide more details about it soon.

I found that there is one more suspicious slab:

[root@zdtm criu]# cat /proc/slabinfo | grep ^kmalloc-32
kmalloc-32 49024 49152 32 128 1 : tunables 0 0 0 : slabdata 384 384 0
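
To see which caches are actually growing, I snapshot /proc/slabinfo before and
after a test run and diff the active-object counts. Roughly like this (a quick
helper of my own, not part of criu or zdtm; it needs root to read /proc/slabinfo):

#!/usr/bin/env python
# Diff the active-object counts in /proc/slabinfo between two snapshots.
# Standalone illustration; run as root.

def read_slabinfo(path='/proc/slabinfo'):
    counts = {}
    with open(path) as f:
        for line in f:
            # Skip the version line and the column-header comment.
            if line.startswith('slabinfo') or line.startswith('#'):
                continue
            fields = line.split()
            # Format: <name> <active_objs> <num_objs> <objsize> ...
            counts[fields[0]] = int(fields[1])
    return counts

def report_growth(before, after):
    for name in sorted(after):
        delta = after[name] - before.get(name, 0)
        if delta > 0:
            print('%-24s +%d' % (name, delta))

Usage is just: before = read_slabinfo(), run the test, then
report_growth(before, read_slabinfo()). With that, mnt_cache and kmalloc-32 are
the caches that stand out here.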

I tried the kmemleak detector, but it reports nothing useful in this case.
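
For reference, this is roughly how I poke kmemleak; nothing test-specific, and
it assumes CONFIG_DEBUG_KMEMLEAK=y with debugfs mounted on /sys/kernel/debug:

#!/usr/bin/env python
# Force an on-demand kmemleak scan and print whatever it reports.
# Needs root; /sys/kernel/debug/kmemleak only exists with CONFIG_DEBUG_KMEMLEAK=y.

KMEMLEAK = '/sys/kernel/debug/kmemleak'

with open(KMEMLEAK, 'w') as f:
    f.write('scan')                # request an immediate scan

with open(KMEMLEAK) as f:
    report = f.read()              # one backtrace per suspected leak

print(report if report else 'kmemleak reported nothing')

Writing "scan" just forces a scan right away instead of waiting for the periodic
kmemleak thread.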

test/zdtm.py is the script we use to execute CRIU tests:
https://github.com/xemul/criu

$ python test/zdtm.py run -t zdtm/static/env00 -f ns --iter 10

This command executes a test in a new set of namespaces (net, mnt, pid, ipc,
uts), then dumps and restores this test container ten times and checks that
everything works as expected. The env00 test sets an environment variable and
then checks that the variable has the same value after c/r.
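
In case it helps with reproducing, the check I do around that command is just a
before/after comparison of the mnt_cache active-object count. A rough sketch of
that wrapper (my own, not part of the test suite; it assumes it is run as root
from the top of the criu tree):

#!/usr/bin/env python
# Run the zdtm c/r test and report how many mnt_cache objects were left behind.
# Illustration only; run as root from the criu source tree.
import subprocess

def mnt_cache_objects():
    with open('/proc/slabinfo') as f:
        for line in f:
            if line.startswith('mnt_cache '):
                return int(line.split()[1])    # active_objs column
    return 0

before = mnt_cache_objects()
subprocess.check_call(['python', 'test/zdtm.py', 'run',
                       '-t', 'zdtm/static/env00', '-f', 'ns', '--iter', '10'])
after = mnt_cache_objects()
print('mnt_cache: %d -> %d (+%d)' % (before, after, after - before))

The numbers at the top of this thread correspond to a delta of a few hundred
mnt_cache objects over those ten iterations.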