Re: [PATCH 1/1][V5] Add reboot_pid_ns to handle the reboot syscall

From: Daniel Lezcano
Date: Fri Feb 03 2012 - 03:59:40 EST


On 02/03/2012 01:10 AM, Andrew Morton wrote:
On Thu, 5 Jan 2012 10:06:50 +0100
Daniel Lezcano<daniel.lezcano@xxxxxxx> wrote:

In the case of a child pid namespace, rebooting the system does not
really makes sense. When the pid namespace is used in conjunction
with the other namespaces in order to create a linux container, the
reboot syscall leads to some problems.

A container can reboot the host. That can be fixed by dropping
the sys_reboot capability but we are unable to correctly to poweroff/
halt/reboot a container and the container stays stuck at the shutdown
time with the container's init process waiting indefinitively.

After several attempts, no solution from userspace was found to reliabily
handle the shutdown from a container.

This patch propose to make the init process of the child pid namespace to
exit with a signal status set to : SIGINT if the child pid namespace called
"halt/poweroff" and SIGHUP if the child pid namespace called "reboot".
When the reboot syscall is called and we are not in the initial
pid namespace, we kill the pid namespace for "HALT", "POWEROFF", "RESTART",
and "RESTART2". Otherwise we return EINVAL.

Returning EINVAL is also an easy way to check if this feature is supported
by the kernel when invoking another 'reboot' option like CAD.

By this way the parent process of the child pid namespace knows if
it rebooted or not and can take the right decision.
Looks OK, although the comments need help. Is the below still true?

Yes, thanks for fixing this.


Do you think it would be feasible to put your testcase into
tools/testing/selftests? I'm thinking "no", because running the test
needs elevated permissions and might reboot the user's machine(!).

Yes, right. I don't think the user will be happy with that. Unfortunately, I don't see how to test this feature without falling into a reboot on failure. On the other side, this very specific feature is used in the container environment and if it fails that will be spotted immediately and fixed. So I don't think that does make sense to add this test in tools/testing/selftests.

[ ... ]

gid_t pid_gid;
int hide_pid;
+ int reboot;
};
This was particuarly distressing. The field was poorly named and other
people forgotting to document their data structures doesn't mean that
we should continue to do this!

Thanks again for adding the description. I will take care next time to add a simple description when the field name is not self-explicit or ambiguous.

-- Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/