/proc/$PID/sched does not take PID namespace into account

From: Emmanuel Deloget
Date: Tue Sep 03 2013 - 10:31:58 EST


Hello,

(please CC me when answering this mail ; and sorry for my borken
English).

I noticed that when a process is executed in a PID namespace it can
still find its real PID by parsing /proc/$PID_IN_NS/sched.

(in this example I created a new PID namespace and ran /bin/bash at
PID 1)

# uname -a
Linux my-thinkpad 3.9-1-amd64 #1 SMP Debian 3.9.6-1 x86_64 GNU/Linux
#
# pidof bash
1
# head -n 1 /proc/1/sched
bash (14957, #threads: 1)
# cat /proc/1/stat
1 (bash) S 0 1 0 34823 220 4202752 11359 (...)

In the root PID namespace

# pidof bash
(...) 14957 (...)
# head -n 1 /proc/14957/sched
bash (14957, #threads: 1)
# cat /proc/14957/stat
14957 (bash) S 14956 14957 23465 34823 14957 4202752 11386 (...)

The principle of least surprise tells me that if my PID is n in a
particular PID namespace then every stat or debug info should tell
me that my PID is n (i.e. I should not be able to get the real PID
of my program). This is a consistency issue: if I get different
information from different sources I may not be able to tell which
one is the right one.
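
A minimal sketch of the comparison above, as a small shell script. It
assumes the first line of the sched file has the format shown in the
transcripts ("comm (pid, #threads: n)"). The helper names sched_pid and
check_sched_pid are made up for illustration and are not part of any
existing tool.

```shell
#!/bin/sh
# Print the PID embedded in the first line of a sched file, e.g.
# "bash (14957, #threads: 1)" prints 14957.
# Assumption: the first-line format is "comm (pid, #threads: n)".
# The greedy ".*" skips any parentheses that appear inside comm.
sched_pid() {
    sed -n '1s/^.*(\([0-9][0-9]*\), #threads: [0-9][0-9]*)$/\1/p' "$1"
}

# Compare the PID the shell sees for itself with the PID reported by
# its own sched file. Hypothetical helper, for illustration only.
check_sched_pid() {
    seen=$$
    reported=$(sched_pid "/proc/$seen/sched")
    if [ "$reported" = "$seen" ]; then
        echo "consistent: pid=$seen"
    else
        echo "mismatch: pid=$seen sched=$reported"
    fi
}
```

Sourcing the script and calling check_sched_pid from a shell inside the
child PID namespace should report a mismatch on an affected kernel,
provided /proc/$PID/sched is available there.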

The issue (if this is really an issue) lies in kernel/sched/debug.c,
function proc_sched_show_task(). The code says [1]:

    SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, p->pid,
               get_nr_threads(p));

I understand that you're not supposed to rely on the content of a /proc
file generated by a kernel source file named "debug.c" in a production
environment, yet several distributions seem to enable this by default
(at least the Debian kernel I'm using does).

I see a few options:

* either it's a bug and it should be corrected (I'm not sure how to
  do it; the printed PID should reflect the current PID namespace
  and I don't know how to get this information).

* or it has been decided on purpose (i.e. it's not a bug);
  /proc/$PID/sched then offers a reliable way to get the real PID
  of any process, including processes that run in a child PID
  namespace.

* or it's a bug but it cannot be corrected (kernel ABI; it has been
  there for a long time, possibly since the PID namespace
  integration).

(There might be other options here).

In the last two cases there is no need for a patch, but I'd be happy
if someone could explain the reasoning.

Best regards,

-- Emmanuel Deloget

[1]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/debug.c?id=refs/tags/v3.11#n495
[2]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/debug.c?id=refs/tags/v3.11#n118

