Re: [PATCH 05/10] Core checkpoint/restart support code

From: Matt Helsley
Date: Mon Apr 04 2011 - 18:36:45 EST


On Mon, Apr 04, 2011 at 04:43:29PM -0500, Nathan Lynch wrote:
> On Mon, 2011-04-04 at 13:32 -0400, Oren Laadan wrote:
> > From the technical point of view it *is* a big problem: there are
> > very good reasons why we chose a certain design.
> >
> > If Nathan is suggesting in-kernel tree creation as a temporary thing
> > to simplify the code for review - then, given that this patch handles
> > a single process, doing so adds lots of unnecessary code, all of which
> > is in the kernel.
> >
> > If this is the beginning of a permanent approach, then it is totally
> > incompatible with what we have done so far, and severely restricts
> > the kind of use-cases of the project, potentially making it too
> > unattractive for many natural adopters, like HPC users. Sorry, nack.
>
> It's not a stopgap measure to "ease review" or whatever; recreating the
> task tree in-kernel is a fundamental - and simplifying - part of the
> design. I have earned through painful experience the opinion that
> recreating the task tree in userspace is pretty much insane, as is
> exposing the pid allocator to userspace via eclone(2), as is attempting
> to support c/r of any resource that isn't isolated/virtualized, as is
> having every recreated task "rendezvous" in the kernel by having them
> all call restart(2), even though little significant work can be done in
> parallel.

So far we've been proceeding under the assumption that some userspace
code ugliness was acceptable if it simplified the kernel code. Given the
ghost issues and the problems you've listed above, I think it's become
questionable whether that choice has simplified the kernel code enough,
and that trying something different is worthwhile.

At this point the only advantage I still see in userspace task creation for
restart is reviewability. eclone is a small piece of code that can be
reviewed independently of restart, and thus will prove a lot easier to
review for correctness and security than in-kernel task creation for restart.

Cheers,
-Matt Helsley