Re: [PATCH] linux-cr: nested pid namespaces (v3)

From: Louis Rilling
Date: Wed Mar 24 2010 - 05:56:22 EST


On 23/03/10 8:52 -0500, Serge E. Hallyn wrote:
> Quoting Louis Rilling (Louis.Rilling@xxxxxxxxxxx):
>
> Hi Louis, thanks again for reviewing.

No problem, and thank you for your patience.

>
> > To me the real reason is to anticipate pid namespace unsharing. And this
> > together with setns() will need to re-consider much of the namespace C/R
> > logic imho. For instance, checkpoint could be done from a foreign task
> > having entered the container, leak detection should take such foreign
> > tasks into account (see example below), etc.
>
> ...
>
> > >
> > > @@ -293,10 +295,15 @@ static int may_checkpoint_task(struct ckpt_ctx *ctx, struct task_struct *t)
> > > _ckpt_err(ctx, -EPERM, "%(T)Nested net_ns unsupported\n");
> > > ret = -EPERM;
> > > }
> > > - /* no support for >1 private pidns */
> > > - if (nsproxy->pid_ns != ctx->root_nsproxy->pid_ns) {
> > > - _ckpt_err(ctx, -EPERM, "%(T)Nested pid_ns unsupported\n");
> > > - ret = -EPERM;
> > > + /* pidns must be descendent of root_nsproxy */
> > > + pidns = nsproxy->pid_ns;
> >
> > In case of unshared pid namespace, task_active_pid_ns(t) should be checked
> > instead of t->nsproxy->pid_ns: we can't checkpoint a foreign task.
>
> Unsharing can only be done to a child ns, so it wouldn't be foreign.
> Though of course that depends on which one ends up being the original
> pid_ns (see below).

If a task was created in an ancestor pid namespace of the checkpointed
container, I call it foreign, and I don't think that we want to checkpoint it
together with the container.

>
> Now, regarding supporting unshared pid_ns, I think that (1) it will
> be a simple matter of separately doing
> pid_pidns = checkpoint_obj(task_active_pid_ns(task));
> nsp_pidns = checkpoint_obj(task->nsproxy->pid_ns);
> since we will need to record both.

Agreed, as long as both of those namespaces are descendants of the container's
root pid namespace.
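
To make the condition concrete, here is a rough user-space sketch of the check
I have in mind. It is only a model, not kernel code: the model_* types and
functions are hypothetical names invented for this illustration, with
active_pid_ns standing in for task_active_pid_ns(t) and nsproxy_pid_ns for
t->nsproxy->pid_ns. It also assumes that a namespace counts as a descendant of
itself, so tasks living directly in the container's root pid namespace pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pid namespace: only the parent link matters. */
struct model_pid_ns {
    struct model_pid_ns *parent;   /* NULL for the initial namespace */
};

/* Hypothetical stand-in for a task, keeping the two namespaces discussed. */
struct model_task {
    struct model_pid_ns *active_pid_ns;   /* models task_active_pid_ns(t) */
    struct model_pid_ns *nsproxy_pid_ns;  /* models t->nsproxy->pid_ns */
};

/* True if ns is root itself or reaches root by following parent links. */
bool model_pidns_descends_from(const struct model_pid_ns *ns,
                               const struct model_pid_ns *root)
{
    assert(root != NULL);
    for (; ns != NULL; ns = ns->parent)
        if (ns == root)
            return true;
    return false;
}

/* A task is "foreign" if it was created outside the container's subtree. */
bool model_task_is_foreign(const struct model_task *t,
                           const struct model_pid_ns *root)
{
    return !model_pidns_descends_from(t->active_pid_ns, root);
}

/* Both namespaces must be inside the container before recording them. */
bool model_may_checkpoint(const struct model_task *t,
                          const struct model_pid_ns *root)
{
    return model_pidns_descends_from(t->active_pid_ns, root) &&
           model_pidns_descends_from(t->nsproxy_pid_ns, root);
}
```

With this model, a task created in an ancestor namespace that later entered
the container (active_pid_ns in the ancestor, nsproxy_pid_ns inside the
container) is reported as foreign and is refused, even though its
nsproxy_pid_ns alone would pass the check.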

> In addition, (2) the most
> recent emails I see on the topics are still unsure about whether
> we want to have the unshared pid_ns be reflected in
> ns_of_pid(task_pid(task)) or task->nsproxy->pid_ns, so I think
> we'll just have to handle them when they are implemented.

I did not notice any (convincing) argument in favor of changing
ns_of_pid(task_pid(task)) (aka task_active_pid_ns(task)), and I like how simple
Eric's proposal is to implement. But I agree with you that pid namespace
handling should not be reworked before a more definitive approach is
implemented.

Thanks,

Louis

--
Dr Louis Rilling Kerlabs
Skype: louis.rilling Batiment Germanium
Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes
http://www.kerlabs.com/ 35700 Rennes
