Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]

From: Ian Kent
Date: Thu Feb 27 2020 - 19:43:35 EST


On Thu, 2020-02-27 at 16:14 +0100, Karel Zak wrote:
> On Thu, Feb 27, 2020 at 02:45:27PM +0100, Miklos Szeredi wrote:
> > > So the problem I want to see fixed is the effect of very large
> > > mount tables on other user space applications, particularly the
> > > effect when a large number of mounts or umounts are performed.
>
> Yes, now you have to generate (in kernel) and parse (in
> userspace) the whole mount table to get information about just
> one mount table entry. This is typical for umount or systemd.
>
> > > > - add a notification mechanism
> > > > - lookup a mount based on path
> > > > - and a way to selectively query mount/superblock information
> > > >   based on path ...
>
> For umount-like use-cases we need mountpoint/ to mount entry
> conversion; I guess something like open(mountpoint/) + fsinfo()
> should be good enough.
>
> For systemd we need the same, but triggered by notification. The
> ideal solution is to get mount entry ID or FD from notification and
> later use this ID or FD to ask for details about the mount entry
> (probably again fsinfo()).
> The notification has to be usable within an epoll() set.
>
> This solves 99% of our performance issues I guess.
>
> > > So that means mount table info. needs to be maintained, whether
> > > that can be achieved using sysfs I don't know. Creating and
> > > maintaining the sysfs tree would be a big challenge I think.
>
> It will still be necessary to get the complete mount table
> sometimes, but not in performance sensitive scenarios.

That was my understanding too.

Mount table enumeration is possible with fsinfo(), but you still
have to handle each and every mount, so the improvement there is not
going to be as great as in cases where the proc mount table has to
be scanned just to find an individual mount. It will be somewhat
more straightforward, though, without the need to dissect text
records.
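
To make the proc case concrete, here is a rough sketch of what user
space has to do today to find a single mount: walk every text record
of the table and dissect it. It assumes the /proc/self/mountinfo
record layout described in proc(5). The names MountEntry,
parse_mountinfo_line and find_mount are made up for this illustration
and are not part of any patch set, and fsinfo() itself is not shown
since its interface is still under discussion.

```python
import re
from typing import NamedTuple, Optional


class MountEntry(NamedTuple):
    """Subset of one mountinfo record (hypothetical helper type)."""
    mount_id: int
    parent_id: int
    mount_point: str
    fs_type: str
    source: str


# mountinfo escapes space, tab, newline and backslash as \ooo octal.
_OCTAL = re.compile(r"\\([0-7]{3})")


def _unescape(field: str) -> str:
    return _OCTAL.sub(lambda m: chr(int(m.group(1), 8)), field)


def parse_mountinfo_line(line: str) -> MountEntry:
    # Fields before " - ": id, parent id, major:minor, root, mount
    # point, options, then zero or more optional fields.
    # Fields after " - ": fs type, mount source, super options.
    left, _, right = line.rstrip("\n").partition(" - ")
    pre = left.split(" ")
    post = right.split(" ")
    return MountEntry(
        mount_id=int(pre[0]),
        parent_id=int(pre[1]),
        mount_point=_unescape(pre[4]),
        fs_type=post[0],
        source=_unescape(post[1]),
    )


def find_mount(table_text: str, mount_point: str) -> Optional[MountEntry]:
    # Every record is parsed even though only one is wanted. When
    # several records share a mount point, the last one listed is
    # returned (assumed here to be the most recent mount on it).
    found = None
    for line in table_text.splitlines():
        if not line.strip():
            continue
        entry = parse_mountinfo_line(line)
        if entry.mount_point == mount_point:
            found = entry
    return found
```

On a live system the table text would come from something like
open("/proc/self/mountinfo").read(), and the whole file has to be
generated by the kernel and read by user space for every such lookup.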

>
> I'm not sure about sysfs/, you need to somehow resolve namespaces,
> order of the mount entries (which one is the last one), etc. IMHO
> translating a mountpoint path to a sysfs/ path will be complicated.

I wonder about that too. After all, sysfs contains a tree of nodes
from which the view is created, unlike proc, which translates kernel
information directly based on what the process should see.

We'll need to wait a bit and see what Miklos has in mind for mount
table enumeration; nothing has been said about namespaces yet.

While fsinfo() is not similar to proc, it does handle namespaces
in a sensible way via file handles, a bit like the proc fs does,
and ordering is catered for in a natural way by the fsinfo()
enumeration. Not sure how that would be handled using sysfs ...

Ian