Re: [RFC][v7][PATCH 0/9] Implement clone2() system call

From: Oren Laadan
Date: Thu Sep 24 2009 - 13:44:19 EST




Sukadev Bhattiprolu wrote:
> === NEW CLONE() SYSTEM CALL:
>
> To support application checkpoint/restart, a task must have the same pid it
> had when it was checkpointed. When containers are nested, the tasks within
> the containers exist in multiple pid namespaces and hence have multiple pids
> to specify during restart.
>
> This patchset implements a new system call, clone2() that lets a process
> specify the pids of the child process.
>
> Patches 1 through 6 are helper patches, needed for choosing a pid for the
> child process.
>
> Patch 8 defines a prototype of the new system call. Patch 9 adds some
> documentation on the new system call, some/all of which will eventually
> go into a man page.
>

[...]

>
> Based on these requirements and constraints, we explored a couple of system
> call interfaces (in earlier versions of this patchset) and currently define
> the system call as:
>
> struct clone_struct {
> u64 flags;
> u64 child_stack;
> u32 nr_pids;
> u32 parent_tid;
> u32 child_tid;

@parent_tid and @child_tid are pointers to userspace memory, so they
need to be 'u64' (and it won't hurt to make @reserved1 a 'u64' as well).

> u32 reserved1;
> u64 reserved2;
> };
>

Also, for forward/backward compatibility, explicitly state in the
documentation, and enforce in the kernel, that flags which are not
defined must not be set, and that reserved{1,2} must remain 0.
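
Something along these lines, as a rough sketch only and not taken from
the patchset: the field order, the CLONE2_VALID_FLAGS mask and the helper
names clone2_check_args() and ptr_to_u64() are all made up for
illustration. It keeps @reserved1 as a 'u32' next to @nr_pids purely to
avoid implicit padding; making it a 'u64' would work just as well if the
struct is padded explicitly.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical mask of the flag bits clone2() would accept; the real
 * set would come from the CLONE_* definitions, not from this value. */
#define CLONE2_VALID_FLAGS 0x00000000ffffffffULL

/* Assumed layout: userspace pointers are carried as 64-bit values so
 * that 32-bit and 64-bit callers share one ABI. */
struct clone_struct {
	uint64_t flags;
	uint64_t child_stack;
	uint64_t parent_tid;	/* userspace pointer */
	uint64_t child_tid;	/* userspace pointer */
	uint32_t nr_pids;
	uint32_t reserved1;	/* must be 0 */
	uint64_t reserved2;	/* must be 0 */
};

/* Catch accidental padding at compile time. */
static_assert(sizeof(struct clone_struct) == 48, "unexpected padding");

/* Return 0 if the arguments are acceptable, -EINVAL otherwise. */
static int clone2_check_args(const struct clone_struct *cs)
{
	/* Undefined flag bits must not be set. */
	if (cs->flags & ~CLONE2_VALID_FLAGS)
		return -EINVAL;
	/* Reserved fields must remain 0 so they can be given meaning later. */
	if (cs->reserved1 || cs->reserved2)
		return -EINVAL;
	return 0;
}

/* Userspace side: widen a pointer into a 64-bit field. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}
```

Rejecting unknown bits up front is what lets a later kernel assign
meaning to them without breaking binaries that happened to pass garbage.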

> sys_clone2(struct clone_struct __user *cs, pid_t __user *pids)
>
> Signed-off-by: Sukadev Bhattiprolu <sukadev@xxxxxxxxxxxxxxxxxx>

Otherwise, looks great.

Oren.
