Re: [PATCH 1/2] proc.5: Document /proc/[pid]/setgroups

From: Michael Kerrisk (man-pages)
Date: Thu Feb 12 2015 - 08:53:39 EST


Hello Eric,

On 02/11/2015 02:51 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx> writes:
>
>> Hi Eric,
>>
>> Ping!
>>
>> Cheers,
>>
>> Michael
>
> My apologies. Your description wasn't wrong but it may be a bit
> misleading, explanation below. You will have to figure out how to work
> that into your proposed text.
>
>> On 2 February 2015 at 16:36, Michael Kerrisk (man-pages)
>> <mtk.manpages@xxxxxxxxx> wrote:
>>> [Adding Josh to CC in case he has anything to add.]
>>>
>>> On 12/12/2014 10:54 PM, Eric W. Biederman wrote:
>>>>
>>>> Signed-off-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
>>>> ---
>>>> man5/proc.5 | 15 +++++++++++++++
>>>> 1 file changed, 15 insertions(+)
>>>>
>>>> diff --git a/man5/proc.5 b/man5/proc.5
>>>> index 96077d0dd195..d661e8cfeac9 100644
>>>> --- a/man5/proc.5
>>>> +++ b/man5/proc.5
>>>> @@ -1097,6 +1097,21 @@ are not available if the main thread has already terminated
>>>> .\" Added in 2.6.9
>>>> .\" CONFIG_SCHEDSTATS
>>>> .TP
>>>> +.IR /proc/[pid]/setgroups " (since Linux 3.19-rc1)"
>>>> +This file reports
>>>> +.BR allow
>>>> +if the setgroups system call is allowed in the current user namespace.
>>>> +This file reports
>>>> +.BR deny
>>>> +if the setgroups system call is not allowed in the current user namespace.
>>>> +This file may be written to with values of
>>>> +.BR allow
>>>> +and
>>>> +.BR deny
>>>> +before
>>>> +.IR /proc/[pid]/gid_map
>>>> +is written to (enabling setgroups) in a user namespace.
>>>> +.TP
>>>> .IR /proc/[pid]/smaps " (since Linux 2.6.14)"
>>>> This file shows memory consumption for each of the process's mappings.
>>>> (The
>>>
>>> Hi Eric,
>>>
>>> Thanks for this patch. I applied it, and then tried to work in
>>> quite a few other details gleaned from the source code and commit
>>> message, and Jon Corbet's article at http://lwn.net/Articles/626665/.
>>> Could you please let me know if the following is correct:
>
> It is close but it may be misleading.
>
>>> /proc/[pid]/setgroups (since Linux 3.19)
>>> This file displays the string "allow" if processes in
>>> the user namespace that contains the process pid are
>>> permitted to employ the setgroups(2) system call, and
>>> "deny" if setgroups(2) is not permitted in that user
>>> namespace.
>
> With the caveat that when gid_map is not set, setgroups is also not
> allowed.

Okay -- I added that point.

>>> A privileged process (one with the CAP_SYS_ADMIN capability
>>> in the namespace) may write either of the strings
>>> "allow" or "deny" to this file before writing a group ID
>>> mapping for this user namespace to the file
>>> /proc/[pid]/gid_map. Writing the string "deny" prevents
>>> any process in the user namespace from employing
>>> setgroups(2).
>
> Or more succinctly: you are allowed to write to /proc/[pid]/setgroups
> when calling setgroups is not allowed because gid_map is unset. This
> ensures we do not have any transitions from a state where setgroups
> is allowed to a state where setgroups is denied. There are only
> transitions from setgroups not-allowed to setgroups allowed.

And I've worked in the above point, rewording a bit along the way.
So, how does the following look (only the first two paragraphs have
changed)?

/proc/[pid]/setgroups (since Linux 3.19)
This file displays the string "allow" if processes in
the user namespace that contains the process pid are
permitted to employ the setgroups(2) system call, and
"deny" if setgroups(2) is not permitted in that user
namespace. (Note, however, that calls to setgroups(2)
are also not permitted if /proc/[pid]/gid_map has not
yet been set.)
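
As a rough illustration of the two conditions in the paragraph above, the following Python sketch reads both files for a process and combines them. The helper names (read_setgroups_state, gid_map_is_set, setgroups_permitted_by_namespace) are hypothetical and not part of any existing API, and the sketch deliberately ignores the caller's capabilities, which setgroups(2) also requires.

```python
from pathlib import Path


def read_setgroups_state(pid="self"):
    """Return "allow" or "deny" for the given pid, or None if unavailable."""
    try:
        text = Path(f"/proc/{pid}/setgroups").read_text().strip()
    except OSError:
        # File absent (kernel older than 3.19, non-Linux) or unreadable.
        return None
    return text if text in ("allow", "deny") else None


def gid_map_is_set(pid="self"):
    """Return True if /proc/[pid]/gid_map holds at least one mapping line."""
    try:
        return bool(Path(f"/proc/{pid}/gid_map").read_text().strip())
    except OSError:
        return False


def setgroups_permitted_by_namespace(pid="self"):
    """Combine both conditions: the file shows "allow" AND gid_map is set.

    Assumption: this only reflects the namespace-level checks described
    above; it does not check the caller's capabilities.
    """
    return read_setgroups_state(pid) == "allow" and gid_map_is_set(pid)


if __name__ == "__main__":
    print(read_setgroups_state(), setgroups_permitted_by_namespace())
```

On a system without /proc/[pid]/setgroups, the first helper returns None and the combined check returns False.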

A privileged process (one with the CAP_SYS_ADMIN capability
in the namespace) may write either of the strings
"allow" or "deny" to this file before writing a group ID
mapping for this user namespace to the file
/proc/[pid]/gid_map. Writing the string "deny" prevents
any process in the user namespace from employing
setgroups(2). In other words, it is permitted to write to
/proc/[pid]/setgroups so long as calling setgroups(2) is
not allowed because /proc/[pid]/gid_map has not been set.
This ensures that a process cannot transition from a
state where setgroups(2) is allowed to a state where
setgroups(2) is denied; a process can only transition
from setgroups(2) being disallowed to setgroups(2) being
allowed.

The default value of this file in the initial user
namespace is "allow".

Once /proc/[pid]/gid_map has been written to (which has
the effect of enabling setgroups(2) in the user
namespace), it is no longer possible to deny setgroups(2) by
writing to /proc/[pid]/setgroups.

A child user namespace inherits the /proc/[pid]/setgroups
setting from its parent.

If the setgroups file has the value "deny", then the
setgroups(2) system call can't subsequently be reenabled
(by writing "allow" to the file) in this user namespace.
This restriction also propagates down to all child user
namespaces of this user namespace.
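
The rules above can be summarized as a small state machine. The Python sketch below is a toy model of the text only, not kernel code. The class and method names (UserNamespaceModel, write_setgroups, write_gid_map, create_child) are made up for illustration, and the model ignores capability checks and every other rule governing gid_map.

```python
class UserNamespaceModel:
    """Toy model of the setgroups/gid_map rules described above."""

    def __init__(self, setgroups="allow"):
        self.setgroups = setgroups  # what /proc/[pid]/setgroups would display
        self.gid_map_set = False    # whether gid_map has been written yet

    def write_setgroups(self, value):
        if value not in ("allow", "deny"):
            raise ValueError(f"unsupported value: {value!r}")
        if value == "deny" and self.gid_map_set:
            # Once gid_map is written, setgroups(2) can no longer be denied.
            raise PermissionError("gid_map has already been written")
        if value == "allow" and self.setgroups == "deny":
            # "deny" is sticky: it cannot be switched back to "allow".
            raise PermissionError("setgroups has already been denied")
        self.setgroups = value

    def write_gid_map(self):
        # Writing the mapping is what enables setgroups(2), unless denied.
        self.gid_map_set = True

    def setgroups_permitted(self):
        return self.setgroups == "allow" and self.gid_map_set

    def create_child(self):
        # The child inherits the setting, so a "deny" propagates downward;
        # the child starts without a gid_map of its own.
        return UserNamespaceModel(self.setgroups)
```

In this model the only reachable change in what setgroups_permitted() returns is from False to True, which matches the one-way transition described earlier.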

Cheers,

Michael



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/