Re: [PATCH v4 2/2] cgroups: add a pids subsystem

From: Austin S Hemmelgarn
Date: Wed Mar 11 2015 - 11:14:03 EST


On 2015-03-10 08:31, Aleksa Sarai wrote:
Hi Austin,

Does pids limit make sense in the root cgroup?

I would say it kind of does, although I would just expect it to track
/proc/sys/kernel/pid_max (either as a read-only value, or as an
alternative way to set it).

Personally, that seems unintuitive. /proc/sys/kernel/pid_max and the pids
cgroup controller are orthogonal features, why should they be able to
affect each other (or even be aware of each other)?

I wouldn't consider them entirely orthogonal, the sysctl value is the
limiting factor for the maximal value that can be set in a given pids
cgroup. Setting an unlimited value in the cgroup is functionally identical
to setting it to be equal to /proc/sys/kernel/pid_max, and the root cgroup
is functionally equivalent to /proc/sys/kernel/pid_max, because all tasks
that aren't in another cgroup get put in the root.

While it is true that /proc/sys/kernel/pid_max would be functionally equivalent
to setting pids.max to the value of /proc/sys/kernel/pid_max (and thus the pids
root cgroup is functionally equivalent to the parent), it is untrue that the
sysctl value is the limiting factor on what "max" is defined as. "max" is
defined as the maximum possible pid_t value (it's really the only sane maximum
value, because trying to use /proc/sys/kernel/pid_max would be problematic due
to the fact that the maximum limit would keep changing and the line between
"max" and some arbitrary value would be blurred). In addition, the sysctl value
limits the number of pids in the system in a separate part of the kernel -- it
has nothing to do with cgroups and cgroups have nothing to do with it.

I did not word this very clearly. What I meant is that
/proc/sys/kernel/pid_max is essentially an external limiting factor that
caps the total number of pids that can be under the root cgroup and its
children, not that the cgroup in any way paid attention to it. It might
be useful to be able to just disable the sysctl option and set the value
through the root cgroup, solely for consistency, although such usage
isn't something I would consider essential in any way.
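
To make the "external limiting factor" point concrete, here is a rough
sketch of how the two limits would combine from userspace's point of view.
It is only an illustration, not how the kernel implements anything: the
helper name effective_pid_cap is made up, and it assumes the cgroup limit
is read as text that is either the literal "max" or a decimal number.

```python
def effective_pid_cap(pids_max: str, pid_max: int) -> int:
    """Return the cap that applies in practice to a cgroup subtree.

    pids_max: text of the cgroup limit, assumed to be either the literal
              "max" or a decimal integer (hypothetical input format).
    pid_max:  value of /proc/sys/kernel/pid_max, passed in by the caller.
    """
    text = pids_max.strip()
    if text == "max":
        # An unlimited cgroup is still bounded by the external sysctl.
        return pid_max
    # Otherwise whichever of the two limits is lower is hit first.
    return min(int(text), pid_max)
```

With a pid_max of 32768, for example, a limit of "max" and a limit of
"4194304" both end up capped at 32768, while a limit of "100" stays at 100.
The cgroup never looks at the sysctl; the cap only applies from outside.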
