[RFC][patch 00/21] PID Virtualization: Overview and Patches

From: Hubertus Franke
Date: Thu Dec 15 2005 - 09:37:02 EST


This patchset is a followup to the posting by Serge.
http://marc.theaimsgroup.com/?l=linux-kernel&m=113200410620972&w=2

In this patchset here, we are providing the pid virtualization mentioned
in serge's posting.

> I'm part of a project implementing checkpoint/restart processes.
> After a process or group of processes is checkpointed, killed, and
> restarted, the changing of pids could confuse them. There are many
> other such issues, but we wanted to start with pids.
>
> This patchset introduces functions to access task->pid and ->tgid,
> and updates ->pid accessors to use the functions. This is in
> preparation for a subsequent patchset which will separate the kernel
> and virtualized pidspaces. This will allow us to virtualize pids
> from users' pov, so that, for instance, a checkpointed set of
> processes could be restarted with particular pids. Even though their
> kernel pids may already be in use by new processes, the checkpointed
> processes can be started in a new user pidspace with their old
> virtual pid. This also gives vserver a simpler way to fake vserver
> init processes as pid 1. Note that this does not change the kernel's
> internal idea of pids, only what users see.
>
> The first 12 patches change all locations which access ->pid and
> ->tgid to use the inlined functions. The last patch actually
> introduces task_pid() and task_tgid(), and renames ->pid and ->tgid
> to __pid and __tgid to make sure any uncaught users error out.
>
> Does something like this, presumably after much working over, seem
> mergeable?

These patches build on top of serge's posted patches (if necessary
we can repost them here).

PID Virtualization is based on the concept of a container.
The ultimate goal is to checkpoint/restart containers.

The mechanism to start a container
is to 'echo "container_name" > /proc/container' which creates a new
container and associates the calling process with it. All subsequently
forked tasks then belong to that container.
There is a separate pid space associated with each container.
Only processes/task belonging to the same container "see" each other.
The exception is an implied default system container that has
a global view.

The following patches accomplish 3 things:
1) identify the locations at the user/kernel boundary where pids and
related ids ( pgrp, sessionids, .. ) need to be (de-)virtualized and
call appropriate (de-)virtualization functions.
2) provide the virtualization implementation in these functions.
3) implement a container object and a simple /proc interface to create one
4) provide a per container /proc/fs

-- Hubertus Franke (frankeh@xxxxxxxxxxxxxx)
-- Cedric Le Goater (clg@xxxxxxxxxx)
-- Serge E Hallyn (serue@xxxxxxxxxx)
-- Dave Hansen (haveblue@xxxxxxxxxx)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/