Re: [RFC][PATCH 0/3] fork: Add the ability to create tasks with given pids

From: Konstantin Khlebnikov
Date: Sun Nov 27 2011 - 04:41:43 EST


Pavel Emelyanov wrote:
OK, here's another proposal that seems to suit all of us:

1. I want to clone tasks with pids set
2. Pedro wants to fork tasks whose pids do not change, and w/o root perms
3. Oleg and Tejun want to have little intrusion into the fork() path

The proposal is to implement the PR_RESERVE_PID prctl, which allocates a pid and
puts it on current. The subsequent fork() uses this pid; the pid survives and keeps
its bit in the pidmap after detach. A 2nd fork() after the 1st task's death can thus
reuse the same pid again. This basic thing doesn't require root perms at all
and is safe against pid reuse problems. When requesting a pid reservation, a task may
specify the pid number it wants to have, but this requires root perms (CAP_SYS_ADMIN).
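
To make the intended lifecycle concrete, here is a toy userspace model of the
reservation semantics described above. It is only a sketch: every name in it
(toy_reserve_pid, toy_fork, toy_detach, the reserved_pid field) is hypothetical,
it is not the patch, and it assumes the previously reserved pid has no live
child when a new reservation replaces it.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PID_MAX 64                /* tiny pid space, enough for the model */

static bool pidmap[TOY_PID_MAX];      /* true = pid number is used or reserved */

struct toy_task {
	int reserved_pid;             /* 0 = no reservation (hypothetical field) */
};

/* Take the lowest free pid number and mark it used; -1 if none is left. */
static int toy_alloc_pid(void)
{
	for (int nr = 1; nr < TOY_PID_MAX; nr++) {
		if (!pidmap[nr]) {
			pidmap[nr] = true;
			return nr;
		}
	}
	return -1;
}

/*
 * Models PR_RESERVE_PID. want == 0 lets the allocator choose; asking for a
 * specific number needs privilege (CAP_SYS_ADMIN in the proposal). A new
 * reservation replaces the previous one.
 */
static int toy_reserve_pid(struct toy_task *t, int want, bool privileged)
{
	int nr;

	if (want != 0) {
		if (!privileged)
			return -1;
		if (want < 1 || want >= TOY_PID_MAX || pidmap[want])
			return -1;
		pidmap[want] = true;
		nr = want;
	} else {
		nr = toy_alloc_pid();
		if (nr < 0)
			return -1;
	}
	if (t->reserved_pid)
		pidmap[t->reserved_pid] = false;  /* old reservation is dropped */
	t->reserved_pid = nr;
	return nr;
}

/* Models fork(): the child gets the reserved pid if there is one. */
static int toy_fork(struct toy_task *parent)
{
	if (parent->reserved_pid)
		return parent->reserved_pid;
	return toy_alloc_pid();
}

/* Models detach of a dead child: a reserved pid keeps its pidmap bit. */
static void toy_detach(struct toy_task *parent, int child_pid)
{
	assert(child_pid > 0 && child_pid < TOY_PID_MAX);
	if (child_pid != parent->reserved_pid)
		pidmap[child_pid] = false;
}
```

In this model, reserve, fork, detach, fork hands out the same number twice,
because the bit in the pidmap is never cleared while the reservation is held.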

Pedro, I suppose this will work for your checkpoint feature in gdb, am I right?

Few comments about intrusion:

* the common path - if (pid != &init_struct_pid) - on fork is just modified
* we have -1 argument to copy_process
* one more field on struct pid is OK, since its size doesn't change (a 32-bit level
is not required anyway, it's OK to reduce it down to 16 bits)
* no clone flags extension
* no new locking - the reserved pid manipulations happen under tasklist_lock and
existing common paths do not require more of it
* yes, we have +1 member on task_struct :(

Current API problems:

* Only one fork() with pid at a time. Next call to PR_RESERVE_PID will kill the
previous reservation (don't know how to fix)
* No way to fork() an init of a pid sub-namespace with a desired pid in the current
namespace (can be fixed with a flag for PR_RESERVE_PID saying that we need a pid
for a namespace of the next level)

* No way to grab existing pid for reserve (can be fixed, if someone wants this)

We can add a flag to sys_wait4() and stash the pid in wait_task_zombie(), right
before release_task(). The code will look something like this:

-       if (p != NULL)
+       if (p != NULL) {
+               if ((wo->wo_flags & WCATCHPID) && !current->pid_stash) {
+                       struct pid *pid = task_pid(p);
+
+                       pid->flags |= PID_STASHED;
+                       current->pid_stash = get_pid(pid);
+               }
                release_task(p);
+       }

And the next fork() creates a child with the same pid.
So, struct pid will work like a boomerang =)
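
A rough userspace model of that boomerang, just to show the reference flow.
All names here (stash_pid, stash_proc, stash_wait_zombie, stash_fork and the
STASH_* constants) are made up for illustration; in particular, how fork()
would consume the stash is my assumption, not something the snippet above
defines.

```c
#include <assert.h>
#include <stddef.h>

#define STASH_WCATCHPID   0x1   /* stand-in for the proposed wait flag */
#define STASH_PID_STASHED 0x1   /* stand-in for PID_STASHED */

struct stash_pid {
	int nr;                 /* pid number */
	int count;              /* reference count */
	unsigned int flags;
};

struct stash_proc {
	struct stash_pid *pid;        /* this task's pid */
	struct stash_pid *pid_stash;  /* pid caught from a reaped child, or NULL */
};

static struct stash_pid *stash_get_pid(struct stash_pid *pid)
{
	pid->count++;
	return pid;
}

static void stash_put_pid(struct stash_pid *pid)
{
	assert(pid->count > 0);
	pid->count--;
}

/* Reap a zombie child; with STASH_WCATCHPID the parent keeps the pid alive. */
static void stash_wait_zombie(struct stash_proc *parent,
			      struct stash_proc *child, int wo_flags)
{
	if ((wo_flags & STASH_WCATCHPID) && !parent->pid_stash) {
		child->pid->flags |= STASH_PID_STASHED;
		parent->pid_stash = stash_get_pid(child->pid);
	}
	stash_put_pid(child->pid);    /* models release_task() dropping its ref */
	child->pid = NULL;
}

/* Fork: a stashed pid, if any, goes to the new child; else use a fresh one. */
static void stash_fork(struct stash_proc *parent, struct stash_proc *child,
		       struct stash_pid *fresh)
{
	if (parent->pid_stash) {
		child->pid = parent->pid_stash;  /* the reference moves over */
		parent->pid_stash = NULL;
	} else {
		child->pid = stash_get_pid(fresh);
	}
	child->pid_stash = NULL;
}
```

The point of the model is that the reference taken in the wait path is the
only thing keeping the pid alive between the reap and the next fork().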