Re: [PATCH] Add pidfs filesystem

From: Alexey Gladkov
Date: Wed Feb 22 2017 - 06:57:23 EST


On Wed, Feb 22, 2017 at 10:40:49AM +0300, Pavel Emelyanov wrote:
> On 02/21/2017 05:57 PM, Oleg Nesterov wrote:
> > On 02/18, Alexey Gladkov wrote:
> >>
> >> This patch allows mounting only the pid-related part of /proc, without
> >> the rest of its objects. Since this is an add-on to /proc, flags applied
> >> to /proc also affect this pidfs filesystem.
> >
> > I leave this to you and Eric, but imo it would be nice to avoid another
> > filesystem.
> >
> >> Why not implement it as another flag to /proc ?
> >>
> >> The /proc flags are stored in the pid_namespace and are global to the
> >> namespace. This means that if you add a flag to hide everything except
> >> the pids, it will act on all mounted instances of /proc.
> >
> > But perhaps we can use mnt_flags? For example, let's abuse MNT_NODEV, see
> > the simple patch below. Not sure it is correct/complete, just to illustrate
> > the idea.
> >
> > With this patch you can mount proc with -onodev and it will only show
> > pids/self/thread_self:
> >
> > # mkdir /tmp/D
> > # mount -t proc -o nodev none /tmp/D
> > # ls /tmp/D
> > 1 11 13 15 17 19 20 22 24 28 3 31 33 4 56 7 9 thread-self
> > 10 12 14 16 18 2 21 23 27 29 30 32 34 5 6 8 self
> > # cat /tmp/D/meminfo
> > cat: /tmp/D/meminfo: No such file or directory
> > # ls /tmp/D/irq
> > ls: cannot open directory /tmp/D/irq: No such file or directory
> >
> > No?
>
> Yes!!! If this whole effort with pidfs and overlayfs moves forward, I would
> prefer to see the nodev procfs version rather than another fs.

But this is not procfs anymore. Anyone who expects a full procfs here will
be disappointed :)

> As far as the overlayfs part is concerned, having an overlayfs mounted on /proc
> inside a container may result in problems, as applications sometimes check that
> /proc contains procfs (by checking statfs.f_type == PROC_SUPER_MAGIC or by
> reading /proc/mounts).

It is not a replacement for procfs; it is a subset of procfs. If some code
expects a real procfs, we should not deceive it.
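
For reference, the f_type check mentioned above usually looks roughly like the
sketch below. This is userspace C, and is_procfs() is a made-up helper name for
illustration, not taken from any particular application. With overlayfs on
/proc, f_type would be the overlayfs magic and the check fails. With a
restricted proc mount, f_type should still be PROC_SUPER_MAGIC, so the check
passes even though most of /proc is not there.

```c
#include <linux/magic.h>	/* PROC_SUPER_MAGIC */
#include <sys/vfs.h>		/* statfs(2), struct statfs */

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns 1 if @path is on a filesystem whose magic is PROC_SUPER_MAGIC,
 * 0 if it is on some other filesystem, and -1 if statfs() fails
 * (errno is left as set by statfs()).
 */
int is_procfs(const char *path)
{
	struct statfs st;

	if (statfs(path, &st) != 0)
		return -1;

	return st.f_type == PROC_SUPER_MAGIC;
}
```

A caller that gets 1 from is_procfs("/proc") will normally go on to open things
like /proc/meminfo, which is exactly where a pids-only instance would surprise it.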

No?

> -- Pavel
>
> > Oleg.
> >
> >
> > --- a/fs/proc/generic.c
> > +++ b/fs/proc/generic.c
> > @@ -305,11 +305,22 @@ int proc_readdir_de(struct proc_dir_entry *de, struct file *file,
> >
> >  int proc_readdir(struct file *file, struct dir_context *ctx)
> >  {
> > +	int mnt_flags = file->f_path.mnt->mnt_flags;
> >  	struct inode *inode = file_inode(file);
> >
> > +	if (mnt_flags & MNT_NODEV)
> > +		return 1;
> > +
> >  	return proc_readdir_de(PDE(inode), file, ctx);
> >  }
> >
> > +static int proc_dir_open(struct inode *inode, struct file *file)
> > +{
> > +	if (file->f_path.mnt->mnt_flags & MNT_NODEV)
> > +		return -ENOENT;
> > +	return 0;
> > +}
> > +
> >  /*
> >   * These are the generic /proc directory operations. They
> >   * use the in-memory "struct proc_dir_entry" tree to parse
> > @@ -319,6 +330,7 @@ static const struct file_operations proc_dir_operations = {
> >  	.llseek = generic_file_llseek,
> >  	.read = generic_read_dir,
> >  	.iterate_shared = proc_readdir,
> > +	.open = proc_dir_open,
> >  };
> >
> >  /*
> > --- a/fs/proc/inode.c
> > +++ b/fs/proc/inode.c
> > @@ -318,12 +318,16 @@ proc_reg_get_unmapped_area(struct file *file, unsigned long orig_addr,
> >
> >  static int proc_reg_open(struct inode *inode, struct file *file)
> >  {
> > +	int mnt_flags = file->f_path.mnt->mnt_flags;
> >  	struct proc_dir_entry *pde = PDE(inode);
> >  	int rv = 0;
> >  	int (*open)(struct inode *, struct file *);
> >  	int (*release)(struct inode *, struct file *);
> >  	struct pde_opener *pdeo;
> >
> > +	if (mnt_flags & MNT_NODEV)
> > +		return -ENOENT;
> > +
> >  	/*
> >  	 * Ensure that
> >  	 * 1) PDE's ->release hook will be called no matter what
> >
> > .
> >

--
Rgrds, legion