Re: [RESEND RFC] translate_pid API

From: Nagarathnam Muthusamy
Date: Tue Mar 13 2018 - 19:57:30 EST




On 03/13/2018 04:10 PM, Jann Horn wrote:
On Tue, Mar 13, 2018 at 3:45 PM, Nagarathnam Muthusamy
<nagarathnam.muthusamy@xxxxxxxxxx> wrote:
On 03/13/2018 03:00 PM, Jann Horn wrote:
On Tue, Mar 13, 2018 at 2:44 PM, Nagarathnam Muthusamy
<nagarathnam.muthusamy@xxxxxxxxxx> wrote:
On 03/13/2018 02:28 PM, Jann Horn wrote:
On Tue, Mar 13, 2018 at 2:20 PM, Nagarathnam Muthusamy
<nagarathnam.muthusamy@xxxxxxxxxx> wrote:
On 03/13/2018 01:47 PM, Jann Horn wrote:
On Mon, Mar 12, 2018 at 10:18 AM, <nagarathnam.muthusamy@xxxxxxxxxx>
wrote:
[...]
+ */
+SYSCALL_DEFINE3(translate_pid, pid_t, pid, u64, source,
+		u64, target)
+{
+	struct pid_namespace *source_ns = NULL, *target_ns = NULL;
+	struct pid *struct_pid;
+	struct pid_namespace *ph;
+	struct hlist_bl_head *shead = NULL;
+	struct hlist_bl_head *thead = NULL;
+	struct hlist_bl_node *dup_node;
+	pid_t result;
+
+	if (!source) {
+		source_ns = &init_pid_ns;
+	} else {
+		shead = pid_ns_hash_head(pid_ns_hash, source);
+		hlist_bl_lock(shead);
+		hlist_bl_for_each_entry(ph, dup_node, shead, node) {
+			if (source == ph->ns.ns_id) {
+				source_ns = ph;
+				break;
+			}
+		}
+		if (!source_ns) {
+			hlist_bl_unlock(shead);
+			return -EINVAL;
+		}
+	}
+	if (!ptrace_may_access(source_ns->child_reaper,
+				PTRACE_MODE_READ_FSCREDS)) {
AFAICS this proposal breaks the visibility restrictions that
namespaces normally create. If there are two namespace-based
containers that use the same UID range, I don't think they should be
able to learn information about each other, such as which PIDs are in
use in the other container; but as far as I can tell, your proposal
makes it possible to do that (unless an LSM or so is interfering). I
would prefer it if this API required visibility of the targeted PID
namespaces in the caller's PID namespace.

I am trying to simulate the same access restrictions that apply
to a process's /proc/<pid>/ns/pid file. If the translator has
access to the /proc/<pid>/ns/pid files of both the source and
destination namespaces, shouldn't it be allowed to translate the
pid between them?
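
For illustration, the procfs-based access described above can be
sketched from user space. This is a minimal sketch, not part of the
proposed patch: the helper name same_pid_ns() is hypothetical. It
relies on two things documented in namespaces(7): dereferencing
/proc/<pid>/ns/pid is subject to a PTRACE_MODE_READ_FSCREDS check, and
two such links refer to the same namespace when their device and inode
numbers match.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if processes a and b are in the same
 * PID namespace, 0 if they are not, and -1 if either ns file cannot be
 * reached (no such process, or the ptrace access check failed).
 */
int same_pid_ns(pid_t a, pid_t b)
{
	char path_a[64], path_b[64];
	struct stat st_a, st_b;

	snprintf(path_a, sizeof(path_a), "/proc/%ld/ns/pid", (long)a);
	snprintf(path_b, sizeof(path_b), "/proc/%ld/ns/pid", (long)b);

	/* stat() follows the symlink, so the kernel applies its access check here. */
	if (stat(path_a, &st_a) < 0 || stat(path_b, &st_b) < 0)
		return -1;

	/* Same device and inode number means same namespace. */
	return st_a.st_dev == st_b.st_dev && st_a.st_ino == st_b.st_ino;
}
```

A caller can only get a result other than -1 for processes that are
visible in its own /proc, which is the visibility requirement under
discussion.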
But the translator doesn't actually need to have access to those
procfs files, right?
I thought it should have access to those procfs files to satisfy the
visibility constraint that the targeted PID namespaces be visible
in the caller's PID namespace, and ptrace_may_access() checks that
constraint.
If there are two containers that use the same UID range,
ptrace_may_access() checks from a process in one container on a
process in another container can pass. Normally, you just can't even
reach the ptrace_may_access() checks because you can't reference
processes in another container in any way.

If there is no way to reference the process in another container,
there is no way to get to the /proc/<pid>/ns/pidns_id file which
exports the ID of that container, right? So a translator has to
first guess the container ID and then try to translate. Even after
translation, unless the translator has proper privileges, I believe
it can't do anything with just the pid, right?
Well, yes to both. You'd have to guess the ID of the container, and
you wouldn't be able to do much with it, apart from finding valid PIDs
and their mapping between namespaces.

By the way, a related concern: The use of global identifiers will
probably also negatively affect Checkpoint/Restore In Userspace?
Will look into this. Can you point me to the specifics of the
use case which could be negatively affected?
AFAICS you won't be able to reliably recreate namespace IDs when a
process is checkpointed and resumed, meaning that checkpoint/resume
won't work on processes that use these namespace IDs.
I agree. When the process is resumed, the namespace IDs
might be obsolete.

Thanks,
Nagarathnam.