Re: pid ns feature request

From: Eric W. Biederman
Date: Fri Apr 25 2014 - 16:26:10 EST


Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:

> On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>>
>>> Unless I'm missing some trick, it's currently rather painful to mount
>>> a namespace /proc. You have to actually be in the pid namespace to
>>> mount the correct /proc instance, and you can't unmount the old /proc
>>> until you've mounted the new /proc. This means that you have to fork
>>> into the new pid namespace before you can finish setting it up.
>>
>> Yes. You have to be inside just about all namespaces before you can
>> finish setting them up.
>>
>> I don't know the context in which needing to be inside the pid namespace
>> is a burden.
>
> I'm trying to sandbox myself. I unshare everything, set up new
> mounts, pivot_root, umount the old stuff, fork, and wait around for
> the child to finish.
>
> This doesn't work: the parent can't mount the new /proc, and the child
> can't either because it's too late.
>
> The only solution I can think of without kernel changes is to fork the
> child (pid 1) before pivot_root, which makes everything more
> complicated. I suppose I can unshare, fork immediately, have the
> child set up all the mounts, and then wake the parent, but this is an
> annoying bit of extra complexity for no obvious gain.

Or perhaps just use clone and clone flags.
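
A minimal sketch of that approach, assuming the caller has CAP_SYS_ADMIN
and that stacking a fresh /proc on the existing mount point is
acceptable. The names sandbox_init and spawn_sandbox_init are
hypothetical, chosen only for illustration:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (1024 * 1024)

/* Runs as pid 1 of the new pid namespace, in its own mount namespace. */
static int sandbox_init(void *arg)
{
    (void)arg;

    /* Keep mount changes from propagating back to the parent namespace. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0) {
        perror("mount MS_PRIVATE");
        return 1;
    }

    /* This task is inside the new pid namespace, so the procfs
     * instance mounted here belongs to that namespace. */
    if (mount("proc", "/proc", "proc",
              MS_NOSUID | MS_NODEV | MS_NOEXEC, NULL) < 0) {
        perror("mount /proc");
        return 1;
    }

    /* pivot_root, unmounting the old tree, and the real workload
     * would follow here. */
    return getpid() == 1 ? 0 : 1;
}

/* Hypothetical helper: returns the child's pid as seen by the caller,
 * or -1 with errno set. */
pid_t spawn_sandbox_init(void)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows down on most architectures, so pass its top. */
    pid_t pid = clone(sandbox_init, stack + CHILD_STACK_SIZE,
                      CLONE_NEWPID | CLONE_NEWNS | SIGCHLD, NULL);

    /* Without CLONE_VM the child runs on its own copy of this memory,
     * so the caller's copy can be released right away. */
    int saved_errno = errno;
    free(stack);
    errno = saved_errno;
    return pid;
}
```

The caller would waitpid() on the returned pid. Because the child is
created directly inside both new namespaces, there is no window in
which the pid namespace exists but nobody inside it can mount /proc.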

What are you doing with the parent process? What value does it serve?

>>> Would it make sense to add a mount option to procfs to request a mount
>>> for pid_ns_for_children instead of task_active_pid_ns?
>>
>> This is about using setns and unshare?
>>
>> Adding a proc mount option that takes a pid namespace file descriptor
>> would be the general solution, and might be worth implementing.
>>
>> Getting a pid namespace file descriptor when there are no pids might be
>> a challenge.
>
> Indeed, hence my request for a specific mode to mount /proc for
> pid_ns_for_children.
>
> FWIW, I also tried forking, having the child mount /proc and exit,
> then forking again later on. That also doesn't work -- it looks like
> you can't recreate pid 1 after it does.

Nope. Once pid 1 (init) is dead the pid namespace is dead.
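
A small sketch that should reproduce this, assuming CAP_SYS_ADMIN. The
helper name errno_of_fork_after_init_exit is hypothetical. It runs the
experiment in a throwaway helper process so the caller's own
pid_ns_for_children is left alone:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper. Returns the errno of a fork() attempted after
 * pid 1 of a new pid namespace has exited (ENOMEM is expected), 0 if
 * that fork unexpectedly succeeded, or -1 if the setup itself failed
 * (for example, when CAP_SYS_ADMIN is missing).
 */
int errno_of_fork_after_init_exit(void)
{
    int status;
    pid_t helper = fork();

    if (helper < 0)
        return -1;

    if (helper == 0) {
        /* Only children of this helper enter the new pid namespace. */
        if (unshare(CLONE_NEWPID) < 0)
            _exit(255);

        pid_t init = fork();          /* becomes pid 1 in the namespace */
        if (init < 0)
            _exit(255);
        if (init == 0)
            _exit(0);                 /* init exits immediately */
        if (waitpid(init, NULL, 0) < 0)
            _exit(255);

        pid_t second = fork();        /* try to repopulate the namespace */
        if (second == 0)
            _exit(0);
        if (second > 0) {
            waitpid(second, NULL, 0);
            _exit(0);
        }
        _exit(errno);                 /* report why the fork failed */
    }

    if (waitpid(helper, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    int code = WEXITSTATUS(status);
    return code == 255 ? -1 : code;
}
```

The second fork() failing with ENOMEM matches what pid_namespaces(7)
describes: once init has terminated, no new pids can be allocated in
that namespace.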

Eric