Re: [RFC][PATCH] procfs: Always expose /proc/<pid>/map_files/ and make it readable

From: Kirill A. Shutemov
Date: Thu Jan 22 2015 - 16:28:57 EST


On Thu, Jan 22, 2015 at 01:00:25PM -0800, Calvin Owens wrote:
> On Thursday 01/22 at 13:02 +0200, Kirill A. Shutemov wrote:
> > On Wed, Jan 21, 2015 at 06:45:54PM -0800, Calvin Owens wrote:
> > > Currently, /proc/<pid>/map_files/ is restricted to CAP_SYS_ADMIN, and
> > > is only exposed if CONFIG_CHECKPOINT_RESTORE is set. This interface
> > > is very useful for enumerating the files mapped into a process when
> > > the more verbose information in /proc/<pid>/maps is not needed.
> > >
> > > This patch moves the folder out from behind CHECKPOINT_RESTORE, and
> > > removes the CAP_SYS_ADMIN restrictions. To avoid exposing files to
> > > processes for whom they may not be visible, a follow_link() stub is
> > > added to the inode_operations struct attached to the symlinks that
> > > prevent them from being followed without CAP_SYS_ADMIN.
> > >
> > > Signed-off-by: Calvin Owens <calvinowens@xxxxxx>
> > > ---
> > > fs/proc/base.c | 42 +++++++++++++++++++++++-------------------
> > > 1 file changed, 23 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > > index 3f3d7ae..7d48003 100644
> > > --- a/fs/proc/base.c
> > > +++ b/fs/proc/base.c
> > > @@ -1632,8 +1632,6 @@ end_instantiate:
> > > return dir_emit(ctx, name, len, 1, DT_UNKNOWN);
> > > }
> > >
> > > -#ifdef CONFIG_CHECKPOINT_RESTORE
> > > -
> > > /*
> > > * dname_to_vma_addr - maps a dentry name into two unsigned longs
> > > * which represent vma start and end addresses.
> > > @@ -1660,11 +1658,6 @@ static int map_files_d_revalidate(struct dentry *dentry, unsigned int flags)
> > > if (flags & LOOKUP_RCU)
> > > return -ECHILD;
> > >
> > > - if (!capable(CAP_SYS_ADMIN)) {
> > > - status = -EPERM;
> > > - goto out_notask;
> > > - }
> > > -
> > > inode = dentry->d_inode;
> > > task = get_proc_task(inode);
> > > if (!task)
> > > @@ -1753,6 +1746,28 @@ struct map_files_info {
> > > unsigned char name[4*sizeof(long)+2]; /* max: %lx-%lx\0 */
> > > };
> > >
> > > +/*
> > > + * Allowing any user to follow the symlinks in /proc/<pid>/map_files/ could
> > > + * allow processes to access files that should not be visible to them, so
> > > + * restrict follow_link() to CAP_SYS_ADMIN for these files.
> > > + */
> > > +static void *proc_map_files_follow_link(struct dentry *d, struct nameidata *n)
> > > +{
> > > + if (!capable(CAP_SYS_ADMIN))
> > > + return ERR_PTR(-EPERM);
> > > +
> > > + return proc_pid_follow_link(d, n);
> > > +}
> >
> > I have thought a bit more about this and not sure it's reasonable to
> > limit it to CAP_SYS_ADMIN. What scenario are we protecting from?
> >
> > Initially, I thought about something like this: privileged process opens a
> > file, map part of it, closes the file and drop privileges with hope to
> > limit further access to mapped window of the file. But I don't see what
> > would stop the unprivileged process from accessing rest of the file using
> > mremap(2). And if a process can do this, anybody who can ptrace(2) the
> > process can do this.
> >
> > Am I missing something?
>
> The specific case I was thinking of is a process in a chroot with a
> mounted /proc inside of it: if a process inside the chroot has the same
> UID as a process outside of it, the chroot'ed process could follow the
> symlinks in map_files/ and poke files it can't actually see, right?

It depends on how you define "poke". If you mean touch content of the
file, then, well, you can do it now. You cannot do anything which requires
file descriptor -- open(), ftrancate(), etc.

--
Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/