Re: [GIT PULL] Namespace file descriptors for 2.6.40

From: C Anthony Risinger
Date: Wed May 25 2011 - 17:55:27 EST


On Wed, May 25, 2011 at 4:38 PM, Serge E. Hallyn <serge@xxxxxxxxxx> wrote:
> Quoting C Anthony Risinger (anthony@xxxxxxx):
>> On Mon, May 23, 2011 at 4:05 PM, Eric W. Biederman
>> <ebiederm@xxxxxxxxxxxx> wrote:
>> >
>> > This tree adds the files /proc/<pid>/ns/net, /proc/<pid>/ns/ipc,
>> > /proc/<pid>/ns/uts that can be opened to refer to the namespaces of a
>> > process at the time those files are opened, and can be bind mounted to
>> > keep the specified namespace alive without a process.
>> >
>> > This tree adds the setns system call that can be used to change the
>> > specified namespace of a process to the namespace specified by a system
>> > call.
>>
>> i just have a quick question regarding these, apologies if wrong place
>> to respond -- i trimmed to lists only.
>>
>> if i understand correctly, mount namespaces (for example), allow one
>> to build such constructs as "private /tmp" and similar that even
>> `root` cannot access ... and there are many reasons `root` does not
>> deserve to completely know/interact with user processes (FUSE makes a
>> good example ... just because i [user] have SSH access to a machine,
>> why should `root`?)
>>
>> would these /proc additions break such guarantees? IOW, would it now
>> become possible for `root` to inject stuff into my private namespaces,
>> and/or have these guarantees never existed and i am mistaken? is there
>> any kind of ACL mechanism that endows the origin process (or similar)
>> with the ability to dictate who can hold and/or interact with these
>> references?
>
> If for instance you have a file open in your private /tmp, then root
> in another mounts ns can open the file through /proc/$$/fd/N anyway.
> If it's a directory, he can now traverse the whole fs.
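
For illustration, the mechanism Serge describes can be sketched in a few
lines of Python. This is only a sketch under assumptions: it runs as an
ordinary user against its own pid on a Linux system with /proc mounted, so
it shows the /proc/<pid>/fd/N path resolution and not an actual
cross-namespace access. The helper names (reopen_via_proc, list_via_proc,
demo) and the file name secret.txt are hypothetical, not from this thread.

```python
import os
import tempfile


def reopen_via_proc(pid: int, fd: int) -> bytes:
    """Read an already-open file through /proc/<pid>/fd/<fd>, not its path."""
    with open(f"/proc/{pid}/fd/{fd}", "rb") as f:
        return f.read()


def list_via_proc(pid: int, fd: int) -> list:
    """List a directory that <pid> holds open, via /proc/<pid>/fd/<fd>."""
    return sorted(os.listdir(f"/proc/{pid}/fd/{fd}"))


def demo():
    # Stand-in for a "private /tmp"; here it is just a temporary directory.
    with tempfile.TemporaryDirectory() as private_tmp:
        path = os.path.join(private_tmp, "secret.txt")  # hypothetical file
        with open(path, "wb") as f:
            f.write(b"private data\n")

        file_fd = os.open(path, os.O_RDONLY)
        dir_fd = os.open(private_tmp, os.O_RDONLY | os.O_DIRECTORY)
        try:
            pid = os.getpid()
            # Neither call below uses the original path, only the fd numbers.
            return reopen_via_proc(pid, file_fd), list_via_proc(pid, dir_fd)
        finally:
            os.close(file_fd)
            os.close(dir_fd)


if __name__ == "__main__":
    data, names = demo()
    print(data, names)
```

Going by Serge's description, a sufficiently privileged process in a
different mount namespace could apply the same path pattern to another
process's pid, which is why an open fd undermines the "private" mount.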

aaah right :-( ... there's always another way isn't there ... curse
you Linux for being so flexible! (just kidding baby i love you)

this seems like a more fundamental issue then? or should i not expect
to be able to achieve separation like this? i ask in the context of
OS virt via cgroups + namespaces, e.g. LXC et al, because i'm about to
perform a massive overhaul to our crusty sub-2.6.18 infrastructure and
i've used/followed these technologies for a couple of years now ... and
it's starting to feel like "the right time".

C Anthony