Re: [RFC] Add option to mount only a pids subset

From: Al Viro
Date: Mon Mar 13 2017 - 09:27:55 EST


On Sun, Mar 12, 2017 at 08:19:33PM -0700, Andy Lutomirski wrote:
> On Sat, Mar 11, 2017 at 6:13 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > PS: AFAICS, simple mount --bind of your pid-only mount will suddenly
> > expose the full thing. And as for the lifetimes making no sense...
> > note that you are simply not freeing these structures of yours.
> > Try to handle that and you'll get a serious PITA all over the
> > place.
> >
> > What are you trying to achieve, anyway? Why not add a second vfsmount
> > pointer per pid_namespace and make it initialized on demand, at the
> > first attempt of no-pid mount? Just have a separate no-pid instance
> > created for those namespaces where it had been asked for, with
> > separate superblock and dentry tree not containing anything other
> > than pid-only parts + self + thread-self...
>
> Can't we just make procfs work like most other filesystems and have
> each mount have its own superblock? If we need to do something funky
> to stat() output to keep existing userspace working, I think that's
> okay.

First of all, most of the filesystems do *NOT* guarantee anything of
that sort. And what's the point of having more instances than
necessary, anyway?

> As far as I can tell, proc_mnt is very nearly useless -- it seems to
> be used for proc_flush_task (which claims to be purely an optimization
> and could be preserved in the common case where there's only one
> relevant mount) and for sysctl_binary. For the latter, we could
> create proc_mnt but make actual user-initiated mounts be new
> superblocks anyway.

Again, what for? It won't salvage that kludge... It's not as if it
had been hard to have separate pid-only instance created when asked
for (and reused every time when we are asked for pid-only). What's
the point of ever having more than two instances per pidns? IDGI...

Folks, there is no one-to-one correspondence between mountpoints and
superblocks. Not since 2000 or so. Just don't try to shove your
per-superblock stuff into vfsmount; it simply won't work. If you
want a separate instance for that thing, then just go ahead and
have ->mount() decide which one to use (and whether to create a new
one). All there is to it...
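
To make the suggestion above concrete, here is a rough userspace sketch of the idea, not kernel code. Every name in it (toy_sb, toy_pidns, toy_mount, toy_proc_mount, toy_bind, toy_pidns_release) is hypothetical and exists only for illustration; the real ->mount() signature and the real superblock lifetime rules are different. It assumes at most two instances per pid namespace, each created at the first request and reused afterwards, with a mount being nothing more than a reference to an instance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a superblock: one instance, one dentry tree. */
struct toy_sb {
	bool pid_only;		/* per-instance property lives here */
};

/* Toy stand-in for a pid namespace: at most two instances. */
struct toy_pidns {
	struct toy_sb *full;		/* ordinary instance */
	struct toy_sb *pid_only;	/* created at first pid-only mount */
};

/* Toy stand-in for a vfsmount: only a view onto some instance. */
struct toy_mount {
	struct toy_sb *sb;
};

/* Return the instance in *slot, allocating it on first use. */
static struct toy_sb *toy_sb_get(struct toy_sb **slot, bool pid_only)
{
	if (!*slot) {
		struct toy_sb *sb = calloc(1, sizeof(*sb));

		if (!sb)
			return NULL;
		sb->pid_only = pid_only;
		*slot = sb;
	}
	return *slot;
}

/*
 * Hypothetical analogue of ->mount(): decide which instance to use
 * and whether a new one has to be created.
 */
static int toy_proc_mount(struct toy_pidns *ns, bool pid_only,
			  struct toy_mount *mnt)
{
	struct toy_sb *sb;

	assert(ns && mnt);
	sb = toy_sb_get(pid_only ? &ns->pid_only : &ns->full, pid_only);
	if (!sb)
		return -1;
	mnt->sb = sb;
	return 0;
}

/* Toy bind mount: a new mount of the same instance. */
static void toy_bind(const struct toy_mount *src, struct toy_mount *dst)
{
	dst->sb = src->sb;
}

/* Instances are owned by the namespace and freed along with it. */
static void toy_pidns_release(struct toy_pidns *ns)
{
	free(ns->full);
	free(ns->pid_only);
	ns->full = NULL;
	ns->pid_only = NULL;
}
```

In this model two pid-only mounts of one namespace end up on the same toy_sb, a full mount gets the other one, and toy_bind() of a pid-only mount stays pid-only, since the property sits in the instance and not in the mount.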