Re: process checkpointing

carguin@iname.com
Wed, 16 Dec 1998 10:08:00 -0500 (EST)


On Wed, 16 Dec 1998, Harvey J. Stein wrote:

> Oren Laadan <orenl@cs.huji.ac.il> writes:
> >
> > Make sure you don't reinvent the wheel...

Isn't that what college is all about? I'm fairly certain I saw that
somewhere. I've heard rumors that one Finnish student actually reinvented
all of Unix!

It's an assignment, so I'll be doing it one way or another. I chose it
because I couldn't see anybody doing anything similar.

> > Also, checkout CoCheck project and Condor at:

Both of these suffer from a lack of source code :)

> > BTW, checkpointing of a single process (unlike a bunch of collaborating
> > processes) can be evidently done (see above) entirely in user-space.
>
> It can be, but there's a substantial problem with mmap. You have to
> make sure that things that are mmapped get remapped to the same place
> when the process is restarted. The condor stuff does this by dumping
> all mmapped segments for the checkpoint. These are then mapped back
> in from the data dump when the process is restarted. Mmaps can occur
> from malloc (depending on the implementation) but the bigger headache
> is that shared libs are mmapped in, meaning that a) all shared libs
> used by the process are dumped, and b) after restart, the process is
> effectively statically linked since it's now using its own copies of
> the shared libs.

My current approach is to read /proc/?/maps... If I have read but not
write access to a page, I dump the device/offset/inode information into a
table to be reloaded later. If I have write access to that page, then I
dump the page into a table. This seemed like the best approach. It keeps
me from dumping libc (or whatever) every time.
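
A minimal sketch of that classification, written in Python for brevity
rather than as the actual checkpointing code. It assumes the
address/perms/offset/dev/inode field layout of a /proc/<pid>/maps line.
The names Mapping, parse_maps_line and plan are hypothetical, and the
"skip" case for regions that are neither readable nor writable is my
own assumption, not something decided above.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int   # first address of the region
    end: int     # one past the last address
    perms: str   # e.g. "r-xp" or "rw-p"
    offset: int  # offset into the backing file
    dev: str     # "major:minor", both in hex
    inode: int   # inode on that device, 0 if none


def parse_maps_line(line: str) -> Mapping:
    """Parse one maps line; any trailing pathname field is ignored."""
    fields = line.split()
    start, end = (int(part, 16) for part in fields[0].split("-"))
    return Mapping(start, end, fields[1], int(fields[2], 16),
                   fields[3], int(fields[4]))


def plan(m: Mapping) -> str:
    """Decide what the checkpoint should record for this region."""
    if m.perms[1] == "w":
        return "dump"       # writable: save the page contents
    if m.perms[0] == "r":
        return "reference"  # read-only: save dev/offset/inode only
    return "skip"           # assumption: nothing recorded


# Illustrative lines, not taken from a real process.
SAMPLE = [
    "08048000-0804c000 r-xp 00000000 03:01 12345",
    "0804c000-0804d000 rw-p 00003000 03:01 12345",
    "bfffe000-c0000000 rwxp fffff000 00:00 0",
]

for text in SAMPLE:
    region = parse_maps_line(text)
    print(hex(region.start), region.perms, plan(region))
```

With this split, a read-only text segment such as libc costs a single
table entry in the checkpoint instead of a copy of every page.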

> It'd be nice if you could remap the original libs instead of dumping
> them. That could substantially speed up dumping and restarting, and
> improve system usage by the restarted binaries, especially if there
> are lots of them. The Condor people argued against this, but their
> argument only applies when you want to migrate checkpointed apps
> across machines (systems might have different versions of the libs or
> might have them in different locations, etc.). They don't really
> apply if you only want to checkpoint and later restart on the same
> machine.

Which is the exact approach I am taking. I may convert the device/inode
numbers into a filename, so that I won't be prone to problems if a library
moves. But I am requiring that the program be run on the same, or at least
a very similar, machine. I don't think I am dependent on the ELF format,
but I also won't guarantee that! This first attempt is pretty crude.

What's giving me hangups now is anonymous mapping. Currently, when I dump
a page I check whether its device numbers are zero, its inode is zero,
and its entire contents are zero. If so, then I know it is (or could be)
an anonymous map. But what about those pages that have already been copied
and written to? In theory I should be able to map them out of the
checkpoint file, but that means /proc/?/maps will be different than it was
before checkpointing, and so something might break. I'm not really sure
how to remap those memory regions.
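
The zero-page test above can be sketched roughly as follows, in Python
for brevity. The function name looks_anonymous and its arguments are
hypothetical. It assumes the caller has already pulled the "major:minor"
device string and the inode out of the maps entry and has read the
page's bytes.

```python
def looks_anonymous(dev: str, inode: int, page: bytes) -> bool:
    """True if the page could be an untouched anonymous mapping.

    Mirrors the three checks described above: device numbers zero,
    inode zero, and every byte of the page zero.
    """
    major, minor = (int(part, 16) for part in dev.split(":"))
    if major != 0 or minor != 0 or inode != 0:
        return False         # backed by a real file
    return not any(page)     # any nonzero byte means it was written


PAGE_SIZE = 4096  # assumption: i386 page size

print(looks_anonymous("00:00", 0, bytes(PAGE_SIZE)))
print(looks_anonymous("00:00", 0, b"\x01" + bytes(PAGE_SIZE - 1)))
```

A page that fails only the last check is exactly the problem case: it
has no backing file, yet its contents have to come from somewhere on
restart.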

The process can definitely be checkpointed from userspace. Loading it back
up requires some additional effort, though... I think, for the sake of
time, I will be devising my own binary format. Having my own binary format
handler should allow me to mmap the libraries and data exactly where I
want them, and keep me from polluting the kernel proper. Also, it should
make it easier (I hope) to port back to 2.0.x. Since the assignment
requires I modify 2.0.27, not 2.1.131, I'll have to do that at some point
:)

--
Chris Arguin                 | "...All we had were Zeros and Ones -- And 
CArguin@iname.com            |  sometimes we didn't even have Ones."
                             +--------------+	- Dilbert, by Scott Adams
http://leonardo.sr.unh.edu/arguin/home.html |
