[RFC PATCH 0/3] new mode 'shadow' for /proc/PID/setgroups

From: Snaipe
Date: Mon May 10 2021 - 12:02:51 EST


"Giuseppe Scrivano" <gscrivan@xxxxxxxxxx> writes:
> This series is based on some old patches I've been playing with some
> years ago, but they were never sent to lkml as I was not sure about
> their complexity/usefulness ratio. It was recently reported by
> another user that these patches are still useful[1] so I am submitting
> the last version and see what other folks think about this feature.

For context, the reason why these patches are useful to us is that our
use-case of user namespaces includes running executables within an
{u,g}id space controlled by the calling user, while still remaining in
the original root filesystem.

For example, we use user namespaces through one of our tools[1] as a
substitute for fakeroot in order to build software that would otherwise
need root permission to package, like sudo or ping, where setting the
setuid bit or, more importantly, file capabilities is necessary.

In these use-cases, respecting the original group membership is actually
desired, not just because of the negative-access permission issues, but
because it's possible to lose legitimate access to files when calling
setgroups while holding membership of unmapped GIDs. This can be very
surprising behaviour, especially when, for example, the caller suddenly
loses access to their current working directory after entering a user
namespace and calling setgroups.
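
To make both effects concrete, here is a rough Python model of the classic
owner/group/other check. It is only a sketch under simplifying assumptions,
not the kernel's code: it ignores ACLs, capabilities and LSMs, and the
`Inode` type, the `may_read` helper and the mode-0750 directory are made up
for illustration. All IDs are host-side IDs (uid 1000, wheel = 998).

```python
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class Inode:
    mode: int  # permission bits, e.g. 0o750
    uid: int   # owning user (host-side ID)
    gid: int   # owning group (host-side ID)


def may_read(inode, uid, gid, groups):
    """Simplified classic Unix check: exactly one permission class applies."""
    if uid == inode.uid:
        return bool(inode.mode & stat.S_IRUSR)
    if gid == inode.gid or inode.gid in groups:
        return bool(inode.mode & stat.S_IRGRP)
    return bool(inode.mode & stat.S_IROTH)


WHEEL = 998
before = {1000, WHEEL}  # supplementary groups before setgroups
after = {1000}          # assumption: the unmapped wheel GID has been dropped

shared = Inode(0o750, uid=0, gid=WHEEL)  # hypothetical: readable via wheel only
denied = Inode(0o705, uid=0, gid=WHEEL)  # drwx---r-x: wheel is locked out

# Legitimate access is lost once wheel is dropped.
print(may_read(shared, 1000, 1000, before), may_read(shared, 1000, 1000, after))
# prints: True False

# Negative access is bypassed once wheel is dropped.
print(may_read(denied, 1000, 1000, before), may_read(denied, 1000, 1000, after))
# prints: False True
```

In this model, keeping the original groups in effect is what avoids both
outcomes at once.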

I've seen other solutions to the original problem mentioned, like
introducing a new sysctl to convey that the system does not use
negative-access permissions. I believe these alternative solutions do not
solve my second point about losing legitimate access, while this patchset
does. I've tested an older version of these patches and they have all of
the desired properties:

$ id
uid=1000(snaipe) gid=1000(snaipe) groups=1000(snaipe),998(wheel)

$ bst grep . /proc/self/uid_map /proc/self/gid_map /proc/self/setgroups
/proc/self/uid_map: 0 1000 1
/proc/self/uid_map: 1 100000 65536
/proc/self/gid_map: 0 1000 1
/proc/self/gid_map: 1 100000 65536
/proc/self/setgroups:shadow

$ ls -l
total 8
drwxr-xr-x 2 root wheel 4096 Apr 23 14:18 allowed
drwx---r-x 2 root wheel 4096 Apr 23 14:18 denied

$ bst sh -c 'id; ls allowed denied'
uid=0(root) gid=0(root) groups=0(root)
allowed:
ls: cannot open directory 'denied': Permission denied

$ bst --groups 1 sh -c 'id; ls allowed denied'
uid=0(root) gid=0(root) groups=0(root),1(daemon)
allowed:
ls: cannot open directory 'denied': Permission denied

Ultimately, we want to make it safe to run our tool as an unprivileged user.
While we're currently riding the status quo of
"safe-to-use-but-not-if-you're-using-negative-permissions", having a way for
us to do the right thing -- without relying on the gotcha that a system
administrator must configure a system knob to make it safe -- is quite
attractive.

[1]: https://github.com/aristanetworks/bst

--
Snaipe