In other words, this is the fix that should remove the need for your dumpable setting:
bfedb589252c ("mm: Add a user_ns owner to mm_struct and fix ptrace permission checks")
I will check, though from what I recall that patch doesn't fix the
ptrace_may_access checks. Not to mention it won't help if the
container doesn't have its own user namespace.
That change very much is to the ptrace_may_access checks.
You are not playing with setgroups if you don't have your own user
namespace. So I don't see how the other cases are relevant.
What I think I would do in the situation you describe is to
join what you are going to join. Limit yourself to creating pid
namespaces with unshare.
If you are joining a user namespace set undumpable.
If you are creating a user namespace create it and then set undumpable.
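For reference, clearing the dumpable flag from userspace is a single prctl call. A minimal sketch follows; the helper names clear_dumpable and is_dumpable are made up for illustration and are not taken from any patch in this thread.

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper: mark the calling process undumpable.
 * Returns 0 on success, -1 on error (errno is set by prctl).
 */
int clear_dumpable(void)
{
	return prctl(PR_SET_DUMPABLE, 0, 0, 0, 0);
}

/*
 * Hypothetical helper: return the current value of the dumpable
 * flag for the calling process, or -1 on error.
 */
int is_dumpable(void)
{
	return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```

A runtime would call clear_dumpable() at the points described above: when joining a user namespace, or right after creating one.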
Clearing dumpable is to help not leak things
into a container when you call setns on a user namespace.
It is also to help not leak things into a container when you join
other namespaces. Most notably the PID namespace.
Except that you don't strictly join a PID namespace. You set a context
for children to run in a different PID namespace. So you are safe
from PID namespaces as long as you don't call fork.
+ if (mode != (S_IFDIR|S_IRUGO|S_IXUGO)) {
I'd just like to draw your attention to this special case -- why is
this special cased? What was the original reasoning behind it? Does it
make sense for a non-dumpable process to allow someone to change the
mode of some random /proc/[pid]/ directories?
This has nothing at all to do with changing modes and is all about
what uid/gid are set on the proc inode. Usually it is the uid/gid
of the process in question but occasionally for undumpable processes
it is root/root to prevent people from accessing the files in question.
I get the feeling that some of this logic is a bit iffy.
It looks like I forgot to carry forward the comment that explains that
case in my patch. Something I need to fix before I merge it.
/*
* Before the /proc/pid/status file was created the only way to read
* the effective uid of a process was to stat /proc/pid. Reading
* /proc/pid/status is slow enough that procps and other packages
* kept stating /proc/pid. To keep the rules in /proc simple I have
* made this apply to all per process world readable and executable
* directories.
*/
Or in short: I broke ps when I removed all of the special cases, and to
fix ps I added the existing special case. Not that the uid or gid of a
directory that the whole world can access matters.
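To make the ps case concrete, here is a rough userspace sketch of reading a process's effective uid by stat'ing /proc/pid, as the comment describes. The helper names proc_pid_owner and proc_self_owner_matches_euid are hypothetical, and the sketch assumes /proc is mounted in the usual place.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the owner of the /proc/<pid> directory
 * with stat(). Returns 0 on success, -1 if the stat fails.
 */
int proc_pid_owner(pid_t pid, uid_t *uid, gid_t *gid)
{
	char path[64];
	struct stat st;

	snprintf(path, sizeof(path), "/proc/%ld", (long)pid);
	if (stat(path, &st) != 0)
		return -1;
	*uid = st.st_uid;
	*gid = st.st_gid;
	return 0;
}

/*
 * Hypothetical check: returns 1 if /proc/<our pid> is owned by our
 * effective uid, 0 if it is not, -1 if the stat failed.
 */
int proc_self_owner_matches_euid(void)
{
	uid_t uid;
	gid_t gid;

	if (proc_pid_owner(getpid(), &uid, &gid) != 0)
		return -1;
	return uid == geteuid();
}
```

Because the special case applies to the world readable and executable per-process directories, the /proc/pid directory itself keeps reporting the process's uid/gid, which is what tools that stat it rely on.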