Re: [RFC PATCH 3/8] kmod - teach call_usermodehelper() to use a namespace

From: Ian Kent
Date: Mon Feb 09 2015 - 19:08:44 EST


On Mon, 2015-02-09 at 17:03 +0100, Oleg Nesterov wrote:
> On 02/09, Ian Kent wrote:
> >
> > On Sun, 2015-02-08 at 20:00 +0100, Oleg Nesterov wrote:
> > > > +
> > > > + this = file_open_root(mnt->mnt_root, mnt, path, O_RDONLY);
> > > > + if (unlikely(IS_ERR(this))) {
> > > > + err = PTR_ERR(this);
> > > > + break;
> > > > + }
> > > > +
> > > > + err = setns_inode(file_inode(this), 0);
> > > > + fput(this);
> > > > + if (err)
> > > > + break;
> > > > + }
> > > > +
> > > > + return err;
> > > > +}
> > >
> > > Yes, I need to actually read this series and setns paths, but at first glance
> > > there must be a simpler method to call ops->install's and switch_task_namespaces.
> >
> > Yes, the namespaces implementation does seem a bit strange in this
> > respect. I mentioned that concern the first time I posted these. But I'm
> > still not that clear on the big picture of how namespace are meant to
> > work.
> >
> > It's not just access to ops->install() that's the problem.
> >
> > For each of the individual namespaces we open a file handle, to get
> > access to ops->install() for that namespace, install it, drop "all" the
> > namespaces then replace them with the new set that essentially has one
> > namespace changed.
>
> I understand. but I still can't understand why we can't implement something
> like
> enter_ns(struct nsproxy *p)
> {
> new_nsproxy = create_new_namespaces(...);
>
> p->mnt_ns->ns->ops->install(new_nsproxy, ...);
> p->pid_ns_for_children->ns->ops->install(new_nsproxy, ...);
> ...
>
> switch_task_namespaces(new_nsproxy);
> }
>
> Why we should abuse fs/proc ?

That sounds like a much better approach.
Your saying just take a reference to the nsproxy from the located
process and use it instead, right?
Working out if there's a difference with what you from the open is
challenging (I already tried), I'll have another go at it.

>
> See also below.
>
> > > Sorry if this was already discussed before, but to me it looks a bit strange
> > > to abuse /proc/ files for this. And again, iiuc file_open_root() can fail if
> > > tsk has already exited (init can be multithreaded).
> >
> > Not sure that the failure is a problem though as long as it's handled
> > since, if the init process of the container is gone (or will be gone
> > once were done), so is the container and the caller.
>
> Not really. Individual thread can exit while the whole "init" process can be alive.
> In particular the main thread can exit and become a zombie, so find_task_by_vpid(1)
> can't work in general.
>
> You can probably use task_active_pid_ns()-child_reaper, but again I do not think
> you should pass "task_struct *" to enter_ns().

OK, I need to check this, btw, thanks for the comments.

>
>
> And. Whatever we do, ops->install() or setns_inode() can't solve the problem with
> pid_ns. You need the additional clone() to "activate" it. pidns_install() does not
> actually change task_active_pid_ns().

Right, but all this is done in preparation for the following do_execve()
call. Isn't that enough or am I missing something?

Ian

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/