Cgroups v2 bug: incorrect reporting in "domain threaded" cgroup.procs file

From: Michael Kerrisk (man-pages)
Date: Wed Nov 07 2018 - 14:46:36 EST


Hello Tejun,

I've discovered what looks to be a bug in the reporting
of PIDs in the cgroup.procs file in the "domain threaded"
node at the root of a threaded subtree. The following
demo is on vanilla kernel 4.19.

Suppose we have the following multithreaded process:

$ ps -L 654
  PID   LWP TTY      STAT   TIME COMMAND
  654   654 pts/12   Tl     0:00 ./cpu_multithread_burner 100
  654   655 pts/12   Tl     0:01 ./cpu_multithread_burner 100

Now suppose we create a threaded subtree in the v2 hierarchy:

# cd /sys/fs/cgroup/unified/
# mkdir -p x/a/b
# echo 'threaded' > x/a/cgroup.type
# echo 'threaded' > x/a/b/cgroup.type
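
To double-check the set-up at this point, one can read back the
cgroup.type file of each node. The following is only a minimal
sketch; the helper name show_types is illustrative and not part of
any existing tool:

```shell
# Print "DIR: TYPE" for each cgroup directory given as an argument,
# where TYPE is the content of that directory's cgroup.type file.
show_types() {
    for dir in "$@"; do
        printf '%s: %s\n' "$dir" "$(cat "$dir/cgroup.type")"
    done
}
```

Invoked as "show_types x x/a x/a/b" from the hierarchy root, I would
expect it to report "domain threaded" for x and "threaded" for both
x/a and x/a/b.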

Then we move the multithreaded process into x/a in
the threaded subtree:

# echo 654 > x/a/cgroup.procs

Now we visualize the set-up using my visualization
program[1]:

# go run ~mtk/lsp/cgroups/view_v2_cgroups.go x
x [dt]
    PIDs: {654}
    a [t]
        TIDs: {654 655-[654]}
        b [t]

The above is as I expect.

Now we move the thread group leader (it has to be the
thread group leader to show the bug) to x/a/b, and again
use my visualization program:

# echo 654 > x/a/b/cgroup.threads
# go run ~mtk/lsp/cgroups/view_v2_cgroups.go x
x [dt]
    PIDs: {654 655}
    a [t]
        TIDs: {655-[654]}
        b [t]
            TIDs: {654}

Note how the *thread ID* of the non-thread-group-leader
(655) is now being reported in x/cgroup.procs, a file
that should list only PIDs (thread group IDs)!

And just to verify that this is not a bug in my
visualization program:

# cat x/cgroup.procs
655
654
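
A quick way to confirm which entries are wrong is to compare each
ID in cgroup.procs with the Tgid field of /proc/ID/status: for a
thread group leader the two match. Below is a minimal sketch; the
function name non_leaders and its optional second argument (an
alternate proc root, handy for testing) are illustrative and not
part of any existing tool:

```shell
# Report every ID in a cgroup.procs file that is not a thread group
# leader, that is, whose Tgid in /proc/ID/status differs from the ID.
# Usage: non_leaders PROCS_FILE [PROC_ROOT]   (PROC_ROOT defaults to /proc)
non_leaders() {
    procs_file=$1
    proc_root=${2:-/proc}
    while read -r id; do
        status="$proc_root/$id/status"
        # Skip IDs whose task has already exited.
        [ -r "$status" ] || continue
        tgid=$(awk '$1 == "Tgid:" { print $2 }' "$status")
        if [ "$tgid" != "$id" ]; then
            echo "$id is a thread of process $tgid"
        fi
    done < "$procs_file"
}
```

Run as "non_leaders x/cgroup.procs" in the state shown above, I
would expect it to flag only 655.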

Your thoughts?

Thanks,

Michael

[1] https://www.spinics.net/lists/cgroups/msg20710.html

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/