Re: [RFC][PATCH v2] procfs: Always expose /proc/<pid>/map_files/ and make it readable

From: Austin S Hemmelgarn
Date: Mon Feb 02 2015 - 09:01:36 EST


On 2015-01-30 20:58, Calvin Owens wrote:
On Thursday 01/29 at 17:30 -0800, Kees Cook wrote:
On Tue, Jan 27, 2015 at 8:38 PM, Calvin Owens <calvinowens@xxxxxx> wrote:
On Monday 01/26 at 15:43 -0800, Andrew Morton wrote:
On Tue, 27 Jan 2015 00:00:54 +0300 Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:

On Mon, Jan 26, 2015 at 02:47:31PM +0200, Kirill A. Shutemov wrote:
On Fri, Jan 23, 2015 at 07:15:44PM -0800, Calvin Owens wrote:
Currently, /proc/<pid>/map_files/ is restricted to CAP_SYS_ADMIN, and
is only exposed if CONFIG_CHECKPOINT_RESTORE is set. This interface
is very useful for enumerating the files mapped into a process when
the more verbose information in /proc/<pid>/maps is not needed.

This is the main (actually only) justification for the patch, and it is
far too thin. What does "not needed" mean? Why can't people just use
/proc/pid/maps?

The biggest difference is that if you do something like this:

fd = open("/stuff", O_BLAH);
map = mmap(NULL, 4096, PROT_BLAH, MAP_SHARED, fd, 0);
close(fd);
unlink("/stuff");

...then map_files/ gives you a way to get a file descriptor for
"/stuff", which you couldn't do with /proc/pid/maps.
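A minimal sketch of that sequence, for illustration only. The helper name reopen_unlinked_mapping() and the temporary file under /tmp are assumptions, not part of the patch. Whether the final open() succeeds depends on the kernel version and on the caller's privileges, which is exactly what this thread is debating, so a refused open is reported separately from a setup failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a temp file, close and unlink it, then try to
 * reopen it through /proc/self/map_files/<start>-<end>.
 * Returns 0 if the file was reopened and its first byte matches the mapping,
 * 1 if the kernel refused or lacks the map_files entry, -1 on setup failure.
 */
int reopen_unlinked_mapping(void)
{
    char tmpl[] = "/tmp/map_files_demo_XXXXXX";
    char path[64];
    char byte = 0;
    long page = sysconf(_SC_PAGESIZE);
    int fd, newfd, result = 1;
    char *map;

    if (page <= 0)
        return -1;
    fd = mkstemp(tmpl);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, page) < 0 || pwrite(fd, "M", 1, 0) != 1) {
        close(fd);
        unlink(tmpl);
        return -1;
    }

    map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);      /* no descriptor left ... */
    unlink(tmpl);   /* ... and no name left in the filesystem */
    if (map == MAP_FAILED)
        return -1;

    /* Entry names are "<vm_start>-<vm_end>" in hex, without a 0x prefix. */
    snprintf(path, sizeof(path), "/proc/self/map_files/%lx-%lx",
             (unsigned long)map, (unsigned long)map + (unsigned long)page);

    newfd = open(path, O_RDONLY);
    if (newfd >= 0) {
        /* Same inode as the mapping, so the first byte must agree. */
        if (pread(newfd, &byte, 1, 0) == 1 && byte == map[0])
            result = 0;
        else
            result = -1;
        close(newfd);
    }
    munmap(map, (size_t)page);
    return result;
}
```

A return value of 1 here just means the link could not be followed in the current environment; parsing /proc/pid/maps offers no equivalent step at all, because the path it prints no longer exists.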

It's also something of a win if you just want to see what is mapped at a
specific address, since you can just readlink() the symlink for the
address range you care about and it will go grab the appropriate VMA and
give you the answer. /proc/pid/maps requires walking the VMA tree, which
is quite expensive for processes with many thousands of threads, even
without the O(N^2) issue.

(You have to know what address range you want though, since readdir() on
map_files/ obviously has to walk the VMA tree just like /proc/N/maps.)
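As a rough sketch of the single-range lookup described above: build the entry name from a known VMA range and readlink() it. The function names map_files_path() and mapped_file_at() are made up for this example, and the range must match a VMA exactly or the lookup fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "/proc/<pid>/map_files/<start>-<end>" into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int map_files_path(char *buf, size_t len, long pid,
                   unsigned long long start, unsigned long long end)
{
    int n;

    assert(buf != NULL && start < end);
    n = snprintf(buf, len, "/proc/%ld/map_files/%llx-%llx", pid, start, end);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: readlink() the entry for one exact VMA range.
 * Returns the length of the target path stored in target, or -1 on error
 * (no such VMA, no permission, or no map_files support).
 */
ssize_t mapped_file_at(long pid, unsigned long long start,
                       unsigned long long end, char *target, size_t len)
{
    char path[96];
    ssize_t n;

    assert(target != NULL && len > 0);
    if (map_files_path(path, sizeof(path), pid, start, end) < 0)
        return -1;
    n = readlink(path, target, len - 1);
    if (n < 0)
        return -1;
    target[n] = '\0';   /* readlink() does not NUL-terminate */
    return n;
}
```

The caller never opens or parses /proc/pid/maps here; the cost of finding the right VMA is left to the kernel's lookup for that one address range.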

This patch moves the folder out from behind CHECKPOINT_RESTORE, and
removes the CAP_SYS_ADMIN restrictions. Following the links requires
the ability to ptrace the process in question, so this doesn't allow
an attacker to do anything they couldn't already do before.

Signed-off-by: Calvin Owens <calvinowens@xxxxxx>

Cc +linux-api@

Looks good to me, thanks! Though I would really appreciate it if someone
from the security camp took a look as well.

Hm, who's that? Kees comes to mind.

And reviewers' task would be a heck of a lot easier if they knew what
/proc/pid/map_files actually does. This:

akpm3:/usr/src/25> grep -r map_files Documentation
akpm3:/usr/src/25>

does not help.

The 640708a2cff7f81 changelog says:

: This one behaves similarly to the /proc/<pid>/fd/ one - it contains
: symlinks one for each mapping with file, the name of a symlink is
: "vma->vm_start-vma->vm_end", the target is the file. Opening a symlink
: results in a file that point exactly to the same inode as them vma's one.
:
: For example the ls -l of some arbitrary /proc/<pid>/map_files/
:
: | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80403000-7f8f80404000 -> /lib64/libc-2.5.so
: | lr-x------ 1 root root 64 Aug 26 06:40 7f8f8061e000-7f8f80620000 -> /lib64/libselinux.so.1
: | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80826000-7f8f80827000 -> /lib64/libacl.so.1.1.0
: | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a2f000-7f8f80a30000 -> /lib64/librt-2.5.so
: | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a30000-7f8f80a4c000 -> /lib64/ld-2.5.so
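For anyone consuming readdir() output from this directory, the entry names in the listing above can be turned back into address ranges along these lines. This is only a sketch: parse_map_files_name() is a hypothetical helper, and it is slightly more lenient than the format the kernel emits (strtoull() would also accept a 0x prefix).

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse "<start>-<end>" (hex, as in the listing above)
 * into a half-open address range [start, end).
 * Returns 0 on success, -1 if the name is malformed or the range is empty.
 */
int parse_map_files_name(const char *name,
                         unsigned long long *start, unsigned long long *end)
{
    char *p;
    const char *q;

    assert(name != NULL && start != NULL && end != NULL);

    /* Reject leading whitespace or signs that strtoull() would accept. */
    if (!isxdigit((unsigned char)name[0]))
        return -1;
    errno = 0;
    *start = strtoull(name, &p, 16);
    if (errno != 0 || p == name || *p != '-')
        return -1;

    q = p + 1;
    if (!isxdigit((unsigned char)q[0]))
        return -1;
    *end = strtoull(q, &p, 16);
    if (errno != 0 || p == q || *p != '\0')
        return -1;

    return (*start < *end) ? 0 : -1;
}
```

Applied to the first entry in the listing, this yields a one-page range (0x1000 bytes) starting at 0x7f8f80403000.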

afaict this info is also available in /proc/pid/maps, so things
shouldn't get worse if the /proc/pid/map_files permissions are at least
as restrictive as the /proc/pid/maps permissions. Is that the case?
(Please add to changelog).

Yes, the only difference is that you can follow the link as per above.
I'll resend with a new message explaining that and the deletion thing.

There's one other problem here: we're assuming that the map_files
implementation doesn't have bugs. If it does have bugs then relaxing
permissions like this will create new vulnerabilities. And the
map_files implementation is surprisingly complex. Is it bug-free?

While I was messing with it I used it a good bit and didn't see any
issues, although I didn't actively try to fuzz it or anything. I'd be
happy to write something to test hammering it in weird ways if you like.
I'm also happy to write testcases for namespaces.

As far as security issues go, as others have pointed out, you can't follow
the links unless you can ptrace the process in question, which seems
like a pretty solid guarantee. As Cyrill pointed out in the discussion
about the documentation, that's the same protection as /proc/N/fd/*, and
those links function in the same way.

My concern here is that fd/* are connected as streams, and while that
has a certain level of badness as an external-to-the-process attacker,
PTRACE_MODE_READ is much weaker than PTRACE_MODE_ATTACH (which is
required for access to /proc/N/mem). Since these fds are the things
mapped into memory on a process, writing to them is a subset of access
to /proc/N/mem, and I don't feel that PTRACE_MODE_READ is sufficient.

If you haven't done close() on a mmapped file, doesn't fd/* allow the
same access to the corresponding regions of memory? Or am I missing
something?

But that said, I can't think of any reason making it MODE_ATTACH would
be a problem. Would you rather that be enforced on follow_link() like
the original patch did, or enforce it for the whole directory?

Whole directory would probably be better, as even just the mapped ranges
could be considered sensitive information. Ideally, the check should be
done both in follow_link() and on the directory itself.

