Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]

From: Miklos Szeredi
Date: Thu Feb 27 2020 - 09:46:02 EST


On Thu, Feb 27, 2020 at 12:34 PM Ian Kent <raven@xxxxxxxxxx> wrote:
>
> On Thu, 2020-02-27 at 10:36 +0100, Miklos Szeredi wrote:
> > On Thu, Feb 27, 2020 at 6:06 AM Ian Kent <raven@xxxxxxxxxx> wrote:
> >
> > > At the least the question of "do we need a highly efficient way
> > > to query the superblock parameters all at once" needs to be
> > > extended to include mount table enumeration as well as getting
> > > the info.
> > >
> > > But this is just me thinking about mount table handling and the
> > > quite significant problem we now have with user space scanning
> > > the proc mount tables to get this information.
> >
> > Right.
> >
> > So the problem is that currently autofs needs to rescan the proc
> > mount table on every change. The solution to that is to
>
> Actually no, that's not quite the problem I see.
>
> autofs handles large mount tables fairly well (necessarily) and
> in time I plan to remove the need to read the proc tables at all
> (that's proven very difficult but I'll get back to that).
>
> This has to be done to resolve the age old problem of autofs not
> being able to handle large direct mount maps. But, because of
> the large number of mounts associated with large direct mount
> maps, other system processes are badly affected too.
>
> So the problem I want to see fixed is the effect of very large
> mount tables on other user space applications, particularly the
> effect when a large number of mounts or umounts are performed.
>
> Clearly large mount tables not only result from autofs and the
> problems caused by them are slightly different to the mount and
> umount problem I describe. But they are a problem nevertheless
> in the sense that frequent notifications that lead to reading
> a large proc mount table have significant overhead that can't be
> avoided because the table may have changed since the last time
> it was read.
>
> It's easy to cause several system processes to peg a fair number
> of CPUs when a large number of mounts/umounts are being performed,
> namely systemd, udisks2 and some others. Also I've seen a couple
> of application processes badly affected purely by the presence of
> a large number of mounts in the proc tables, that's not quite so
> bad though.
>
> >
> > - add a notification mechanism
> > - lookup a mount based on path
> > - and a way to selectively query mount/superblock information
> >   based on path ...
> >
> > right?
> >
> > For the notification we have uevents in sysfs, which also supplies
> > the changed parameters. Taking aside namespace issues and addressing
> > mounts would this work for autofs?
>
> The parameters supplied by the notification mechanism are important.
>
> The place this is needed will be libmount since it catches a broad
> number of user space applications, including those I mentioned above
> (well at least systemd, I think also udisks2, very probably others).
>
> So that means mount table info. needs to be maintained, whether that
> can be achieved using sysfs I don't know. Creating and maintaining
> the sysfs tree would be a big challenge I think.
>
> But before trying to work out how to use a notification mechanism
> just having a way to get the info provided by the proc tables using
> a path alone should give initial immediate improvement in libmount.

Adding Karel, Lennart, Zbigniew and util-linux@xxxxxxx

At a quick glance at libmount and systemd code, it appears that just
switching out the implementation in libmount will not be enough:
systemd is calling functions like mnt_table_parse_*() when it receives
a notification that the mount table changed.
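
To make the cost of that pattern concrete, here is a rough sketch of
what "look up one mount" amounts to when the only source is the proc
table. It is not taken from libmount or systemd: struct mount_entry,
parse_mountinfo_line() and find_mount() are made-up names, and the field
layout is the one documented for /proc/[pid]/mountinfo in proc(5).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Subset of one mountinfo record; all names here are hypothetical. */
struct mount_entry {
    int id;             /* mount ID */
    int parent_id;      /* ID of the parent mount */
    char target[256];   /* mount point, octal escapes left undecoded */
    char options[256];  /* per-mount options */
    char fstype[64];    /* filesystem type */
    char source[256];   /* mount source */
};

/* Parse a single mountinfo line; returns 0 on success, -1 if malformed. */
int parse_mountinfo_line(const char *line, struct mount_entry *out)
{
    const char *sep;

    assert(line != NULL && out != NULL);

    /* Fixed fields: ID, parent ID, major:minor, root, mount point, options. */
    if (sscanf(line, "%d %d %*u:%*u %*s %255s %255s",
               &out->id, &out->parent_id, out->target, out->options) != 4)
        return -1;

    /* Optional fields end at a lone "-"; fstype and source follow it. */
    sep = strstr(line, " - ");
    if (sep == NULL)
        return -1;
    if (sscanf(sep + 3, "%63s %255s", out->fstype, out->source) != 2)
        return -1;
    return 0;
}

/*
 * Find one mount point by scanning the table from the current position.
 * Every record before the match gets read and parsed, and nothing can be
 * reused from an earlier pass because the table may have changed.
 * Returns 0 if found, -1 otherwise.
 */
int find_mount(FILE *table, const char *target, struct mount_entry *out)
{
    char line[4096];

    while (fgets(line, sizeof(line), table) != NULL) {
        if (parse_mountinfo_line(line, out) == 0 &&
            strcmp(out->target, target) == 0)
            return 0;
    }
    return -1;
}
```

A caller would fopen("/proc/self/mountinfo", "r") and pass the stream to
find_mount(). With a very large table and a notification per mount or
umount, that linear scan is repeated each time, in every process that
watches the table.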

What is the end purpose of parsing the mount tables? Can the systemd
guys comment on that?

Thanks,
Miklos