Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart

From: Oren Laadan
Date: Wed Oct 22 2008 - 14:33:25 EST

In reply to: Serge E. Hallyn: "Re: [RFC v7][PATCH 2/9] General infrastructure for checkpointrestart"
Next in thread: Oren Laadan: "[RFC v7][PATCH 1/9] Create syscalls: sys_checkpoint, sys_restart"

Serge E. Hallyn wrote:
> Quoting Oren Laadan (orenl@xxxxxxxxxxxxxxx):
>>
>> Serge E. Hallyn wrote:
>>> Quoting Oren Laadan (orenl@xxxxxxxxxxxxxxx):
>>> Just thinking aloud...
>>>
>>> Is read mode appropriate? The user can edit the statefile and restart
>>> it. Admittedly the restart code should then do all the appropriate
>>> checks for recreating resources, but I'm having a hard time thinking
>>> through this straight.
>>>
>>> Let's say hallyn is running passwd. ruid=500,euid=0. He quickly
>>> checkpoints. Then he restarts. Will restart say "ok, the /bin/passwd
>>> binary is setuid 0 so let hallyn take euid=0 for this?" I guess not.
>>> But are there other resources for which this is harder to get right?
>> I'd say that checkpoint and restart are separate.
>>
>> In checkpoint, you read the state and save it somewhere; you don't
>> modify anything in the target task (container). This is equivalent to
>> ptrace read-mode. If you could do ptrace, you could save all that
>> state. In fact, you could save it in a format that is suitable for
>> a future restart ... (or just forge one !)
>
> Yeah, that's convincing.
>
>> In restart, we either don't trust the user and keep everything to
>> be done with her credentials, or we trust the root user and allow
>> all operations (like loading a kernel module).
>>
>> We can actually have both modes of operations. How to decide that
>> we trust the user is a separate question: one option is to have
>> both checkpoint and restart executables setuid - checkpoint will
>> sign (in user space) the output image, and restart (in user space)
>> will validate the signature, before passing it to the kernel. Surely
>> there are other ways...
>
> Makes sense.
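
A rough user-space sketch of the sign-then-validate idea quoted above,
assuming a shared secret key that only the setuid checkpoint and restart
binaries can read. The names sign_image and verify_image, the HMAC-SHA256
choice, and the "tag appended to the image" layout are all assumptions for
illustration; nothing in the patch set defines them.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32-byte HMAC-SHA256 tag


def sign_image(image: bytes, key: bytes) -> bytes:
    """Checkpoint side (hypothetical): append an HMAC tag to the image."""
    tag = hmac.new(key, image, hashlib.sha256).digest()
    return image + tag


def verify_image(signed: bytes, key: bytes) -> bytes:
    """Restart side (hypothetical): check the tag, return the bare image.

    Raises ValueError if the image was edited after checkpoint, so it is
    never handed on to the kernel.
    """
    if len(signed) < TAG_LEN:
        raise ValueError("image too short to carry a signature")
    image, tag = signed[:-TAG_LEN], signed[-TAG_LEN:]
    expected = hmac.new(key, image, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing
    if not hmac.compare_digest(tag, expected):
        raise ValueError("checkpoint image failed signature check")
    return image
```

With a scheme like this, a user who edits the statefile cannot produce a
valid tag without the key, so restart can refuse the image before any
privileged operation is attempted.
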
>
> ...
>
>>> Hmm, so do you think we just always use the caller's credentials?
>> Nope, since we will fail to restart in many cases. We will need a way
>> to move from caller's credentials to saved credentials, and even from
>> caller's credentials to privileged credentials (e.g. to reopen a file
>> that was created by a setuid program prior to dropping privileges).
>
> Can we agree to worry about that much much later? :) Would you agree

Definitely. Even more so - I believe that's a user-space issue :)

> that for the majority of use-cases, restarting with caller's credentials
> will work? Or am I wrong about that?

That depends on your target audience. For HPC you're probably right.
For server applications this may not be the case (e.g. apache needs
a privileged port, and then it drops privileges).
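
To make the apache case concrete, here is a minimal sketch of the
bind-then-drop pattern. On a default Linux setup, binding a port below 1024
needs root (or CAP_NET_BIND_SERVICE), so a restart running purely with the
caller's credentials could not redo the bind step. The function name
bind_then_drop and its parameters are hypothetical.

```python
import os
import socket


def bind_then_drop(port, uid, gid, host="127.0.0.1"):
    """Bind a listening socket, then drop root privileges if we hold them."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        # For port < 1024 this step is the one that needs privilege.
        sock.bind((host, port))
        sock.listen(16)
    except OSError:
        sock.close()
        raise
    if os.geteuid() == 0 and uid != 0:
        # Order matters: groups and gid first, uid last, because after
        # setuid() we no longer have the right to change the others.
        os.setgroups([])
        os.setgid(gid)
        os.setuid(uid)
    return sock
```

After the drop, the process still holds the privileged socket but can no
longer create a new one, which is exactly the state a restart would have
to reproduce.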

I agree that we may safely (...) defer this discussion until the
implementation gets much beefier.

>
>> To do that, we will need to agree on a way to escalate/change the
>> credentials. This however belongs to user-space (and then the binaries
>> for checkpoint/restart will be setuid themselves).
>
> Ok those are less scary, and I have no problem with those.
>
>> There will also be the issue of mapping credentials: a user A may have
>> one UID/GID on one system and another UID/GID on another system, and
>> we may want to do the conversion. This, too, can be done in user space
>> prior to restart by using an appropriate filter through the checkpoint
>> stream.
>
> User namespaces may help here too. So user A can create a new user
> namespace and restart as user B in that namespace. But right now that
> sounds like overkill.

Indeed, virtualization is probably the solution. Here, too, I think
it's safe to defer the discussion.
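
For the simpler user-space filter mentioned above, a sketch could look like
the following. The record layout is purely an assumption (a stream of
fixed-size little-endian uid/gid pairs); the real checkpoint image format is
not specified here, and remap_credentials is a hypothetical name.

```python
import io
import struct

# Hypothetical credential record: 32-bit uid followed by 32-bit gid.
CRED_RECORD = struct.Struct("<II")


def remap_credentials(src, dst, uid_map, gid_map):
    """Copy credential records from src to dst, translating IDs.

    IDs missing from the maps are passed through unchanged.
    Returns the number of records processed.
    """
    count = 0
    while True:
        raw = src.read(CRED_RECORD.size)
        if not raw:
            break  # clean end of stream
        if len(raw) != CRED_RECORD.size:
            raise ValueError("truncated credential record")
        uid, gid = CRED_RECORD.unpack(raw)
        dst.write(CRED_RECORD.pack(uid_map.get(uid, uid),
                                   gid_map.get(gid, gid)))
        count += 1
    return count
```

For example, mapping uid 500 on the source system to uid 1000 on the
target:

```python
src = io.BytesIO(CRED_RECORD.pack(500, 500))
dst = io.BytesIO()
remap_credentials(src, dst, {500: 1000}, {500: 100})
```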

Oren.

