Re: [PATCH 3/3] pids: Make it possible to clone tasks with givenpids

From: Oleg Nesterov
Date: Fri Nov 11 2011 - 11:44:14 EST


On 11/11, Pavel Emelyanov wrote:
>
> On 11/11/2011 07:25 PM, Oleg Nesterov wrote:
> >
> > But. Let me repeat the question, what if you do the same with
> > pids[0] = 2 /* anything != 1 */ ? In this case we create the new
> > pid_ns, but its ->child_reaper is NULL. Unless I missed something.
>
> Hm... You're right here. I've missed the fact, then in recent kernels
> child_reaper is set under pid == 1 condition (was clone_flags & CLONE_NEWPID).

Yes, I always hated the "cleanup" which removed CLONE_NEWPID from
copy_process. This is_child_reaper() simply hides CLONE_NEWPID from
grep.

But this is offtopic. We should not create ->child_reaper with pid_nr != 1.

> How about if I fix it by disabling the simultaneous use of CLONE_NEWPID and
> CLONE_CHILD_USEPIDS and checking for last_pid != 1 in the set_pidmap?

I think this should work...

> > Hmm. It seems, we can make a simpler patch to achieve the (roughly)
> > same effect. Without touching copy_process/alloc_pid paths. What if
> > we simply add PR_SET_LAST_PID? (or something else).
> >
> > In this case the new init (created normally) read the pids from image
> > file and does prcrl(PR_SET_LAST_PID, pid-1) before the next fork.
> >
> > What do you think?
>
> This will make it impossible to fork() children on restore in parallel. And
> I don't want to lose this ability :(

Yes, this is true. You need some form of synchronization in user-space.
But, otoh, prctl/sysctl/whatever is much simpler. Both from implementation
pov and from understanding/using. You can even do, say, pthread_create()
to make a thread with the desired tid. And of course I like the fact we
do not add the new hacks into copy_process's paths.

And. If you want to restore the process tree, then these new children
have to cooperate anyway. Say, nobody can clone() without
CLONE_CHILD_USEPIDS before we restore all pids.

Yes, sysctl+clone should be "atomic", but that is all. Does it really
hurt? OK, if nothing else, can't you do somthing like

int fork_with_pid(int pid)
{
int ret;
int pipefd[2];

pipe(pipefd);

retry:
prcrl(PR_SET_LAST_PID, pid-1);
ret = fork();

if (ret == 0) {
/* child, wait from parent's ACK */
read(pipefd[0], 1, NULL);
return 0;
}

/* raced with another user of PR_SET_LAST_PID */
if (unlikely(ret != pid) {
kill(ret, SIGKILL);
waitpid(ret);
goto retry;
}

close(pipefd[1]);
return pid;
}

?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/