Re: [RFC][PATCH 04/20] pspace: Allow multiple instaces of the process id namespace

From: Herbert Poetzl
Date: Mon Feb 20 2006 - 12:02:51 EST


On Mon, Feb 20, 2006 at 12:27:03PM +0300, Kirill Korotaev wrote:
>>>> I also don't understand why you are eager to introduce new sys
>>>> calls like pkill(path_to_process), but is trying to use waitpid()
>>>> for pspace die notifications? Maybe it is simply better to
>>>> introduce container specific syscalls which should be clean and
>>>> tidy, instead of messing things up with clone()/waitpid() and so
>>>> on? The more simple code we have, the better it is for all of us.

>>> now that you mention it, maybe we should have a few
>>> rounds how those 'generic' container syscalls would
>>> look like?

>> I still like the following:
>>
>> clone(): extended with flags for asking a private copy of various
>> namespaces. For the CLONE_NEWPIDSPACE flag, the pid which
>> is returned to the parent process is it's handle to the
>> new pidspace.

> - clone has limited number of flags.
> - sooner or later you will need to add flags about what and how
> you want to close (e.g. pspace with weak or strong isolation
> and so on)

I still do not see a need to do that at clone() time
but I assume you have your reasons for postulating this ...

>> sys_execpid(pid, argv, envp): exec a new syscall with the requested
>> pid, if the pid is available. Else either return an error,
>> or pick a random pid. Error makes sense to me, but that's
>> probably debatable.
> the problem is that in real life environments where executables can be
> substitutes this is kind of a security issue. Also I really hate the
> idea of using exec() for changing something.

> In OpenVZ we successfully do context changes without exec()'s.

here I agree, changes between *spaces should be independant
from exex(), but IMHO there is no need to reuse existing
interfaces for that, a syscall will do fine here ...

>> sys_fork_slide(pid): fork and slide into the pidspace belong to pid.
>> This way we can do things like
>>
>> p = sys_fork_slide(2794);
>> if (p == 0) {
>> kill(2974);
>> } else {
>> waitpid(p, 0, 0);
>> }
>> Ok, this last one in particular needs to be improved, but these two
>> syscalls and the clone flags together give us all we need. Right?
> Again, you concentrate on PIDspaces forgeting about all the other
> namespaces.
>
> I would prefer:
>
> - sys_ns_create()
> creates namespace and makes a task to inherit this namespace.
> If _needed_, it _can_ fork inside.

I don't see why not have both, the auto-created
*space on clone() and the ability to create empty
*spaces which can be populated and/or entered

> - sys_ns_inherit()
> change active namespace.

hmm, sounds like a misnomer to me ...

> But how should we reference namespace? by globabl ID?

definitely by some system-unique identifier ...

best,
Herbert

> Kirill
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/