Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]

From: Steven Whitehouse
Date: Tue Mar 03 2020 - 05:22:21 EST


Hi,

On 03/03/2020 09:48, Miklos Szeredi wrote:
On Tue, Mar 3, 2020 at 10:26 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
On Tue, Mar 3, 2020 at 10:13 AM David Howells <dhowells@xxxxxxxxxx> wrote:
Miklos Szeredi <miklos@xxxxxxxxxx> wrote:

I'm doing a patch. Let's see how it fares in the face of all these
preconceptions.

Don't forget the efficiency criterion. One reason for going with fsinfo(2) is
that scanning /proc/mounts when there are a lot of mounts in the system is
slow (not to mention the global lock that is held during the read).

BTW, I do feel that there's room for improvement in userspace code as
well. Even quite a big mount table could be scanned for *changes* very
efficiently, i.e. cache the previous contents of /proc/self/mountinfo and
compare them with the new contents, line by line. Only the
changed/added/removed lines need to be parsed.
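
A rough sketch of that cache-and-compare idea, in Python. It assumes the first field of each /proc/self/mountinfo line is the mount ID (as documented in proc(5)) and uses it as the key. The function names and the sample lines are made up for illustration, and the sketch ignores the fact that mount IDs can be reused after an unmount.

```python
def index_by_mount_id(lines):
    """Map mount ID (first field of a mountinfo line) to the full line."""
    return {line.split(" ", 1)[0]: line for line in lines if line.strip()}


def diff_mountinfo(old_lines, new_lines):
    """Compare two snapshots; only the returned lines need full parsing."""
    old = index_by_mount_id(old_lines)
    new = index_by_mount_id(new_lines)
    added = [new[k] for k in new if k not in old]
    removed = [old[k] for k in old if k not in new]
    changed = [new[k] for k in new if k in old and new[k] != old[k]]
    return added, removed, changed


def read_mountinfo(path="/proc/self/mountinfo"):
    """Read the current mount table as a list of lines (Linux only)."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return f.read().splitlines()


# Two hand-written snapshots; the contents are illustrative only.
before = [
    "22 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw",
    "23 22 0:21 / /proc rw,nosuid - proc proc rw",
    "25 22 0:31 / /mnt/old rw,relatime - tmpfs tmpfs rw",
]
after = [
    "22 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw",
    "23 22 0:21 / /proc ro,nosuid - proc proc rw",
    "24 22 0:30 / /mnt/data rw,relatime - tmpfs tmpfs rw",
]
added, removed, changed = diff_mountinfo(before, after)
```

In a real monitor, `before` would be the cached result of the previous `read_mountinfo()` call and `after` the fresh one. The whole file still has to be read each time, so this only saves the parsing, not the scan itself.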

Also it would be pretty easy to throttle the number of updates so
systemd et al. wouldn't hog the system with unnecessary processing.
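
One way such throttling might look on the consumer side is to coalesce change notifications and rescan at most once per interval. This is only a sketch: the class name, the interval and the injectable clock are assumptions for illustration, and it limits the work done by the listener, not the rate at which the kernel generates events.

```python
import time


class RescanThrottle:
    """Coalesce change notifications into at most one rescan per interval."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between rescans
        self.clock = clock                # injectable so it can be tested
        self.last_scan = None
        self.pending = False

    def notify(self):
        """Record that the mount table changed; cheap, no rescan here."""
        self.pending = True

    def should_scan(self):
        """Return True if a rescan is due now, and mark it as started."""
        if not self.pending:
            return False
        now = self.clock()
        if self.last_scan is not None and now - self.last_scan < self.min_interval:
            return False  # too soon; the change stays pending
        self.last_scan = now
        self.pending = False
        return True
```

A poll loop would call `notify()` for every wakeup on the mount table and `should_scan()` before doing the expensive re-read, so a burst of changes costs one rescan rather than one per change.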

Thanks,
Miklos


At least having patches to compare would allow us to look at the performance here and gain some numbers, which would help to frame the discussion. However, I'm not seeing how it would be easy to throttle updates: they occur at whatever rate they are generated, and that rate can be fairly high. I'm also not sure that I follow how the notifications and the dumping of the whole table are synchronized in this case.

Al has pointed out before that a single mount operation on a subtree can generate a large number of changes on that subtree. That kind of scenario will need to be dealt with efficiently so that we don't miss things. Keeping the notification messages small would also minimize both the possibility of overruns and the additional overhead on the mount changes themselves.
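
To make the overrun concern concrete, here is a toy Python model of a bounded notification queue. It is not taken from the patch set; the class and its behaviour (drop the event, set a flag) are assumptions chosen to show why a listener needs a fallback when events are lost.

```python
from collections import deque


class BoundedEventQueue:
    """Fixed-capacity queue that records when events had to be dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.overrun = False

    def post(self, event):
        """Queue an event; on a full queue, drop it and remember the loss."""
        if len(self.events) >= self.capacity:
            self.overrun = True
            return False
        self.events.append(event)
        return True

    def drain(self):
        """Return (events, overrun) and reset the queue state."""
        events = list(self.events)
        overrun = self.overrun
        self.events.clear()
        self.overrun = False
        return events, overrun
```

A listener that sees `overrun` set can no longer trust its incremental view and has to fall back to a full rescan, which is exactly the expensive case that small messages and a well-sized queue are meant to make rare.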

We should also look at what the likely worst case might be. I seem to remember from what Ian has said in the past that there can be tens of thousands of autofs mounts on some large systems. I assume that worst case might be something like that, but multiplied by however many containers might be on a system. Can anybody think of a situation which might require even more mounts?

The network subsystem had a similar problem... they use rtnetlink for the routing information, and just like the proposal here it contains a dump mechanism and a way to listen to events (add/remove routes) which is synchronized with that dump. Ian did start looking at netlink some time ago, but it also has some issues: it is in the network namespace, not the fs namespace, and it has accumulated various things over the years that we don't need for filesystems. Even so, it was part of the original inspiration for the fs notifications.
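
As a conceptual illustration of a dump that is synchronized with an event stream, here is a toy Python model. It is not how rtnetlink or the proposed fs notifications are implemented; the `MountTable` class and the per-change sequence number are assumptions made purely to show the idea of subscribing first, dumping, and then applying only the events newer than the dump.

```python
class MountTable:
    """Toy stand-in for a kernel-side table; not a real kernel interface."""

    def __init__(self):
        self.seq = 0          # bumped on every change
        self.entries = {}
        self.listeners = []

    def subscribe(self):
        """Start receiving events; returns the list they are appended to."""
        events = []
        self.listeners.append(events)
        return events

    def _emit(self, op, key, value=None):
        self.seq += 1
        for queue in self.listeners:
            queue.append((self.seq, op, key, value))

    def add(self, key, value):
        self.entries[key] = value
        self._emit("add", key, value)

    def remove(self, key):
        del self.entries[key]
        self._emit("remove", key)

    def dump(self):
        """Return the current sequence number and a copy of the table."""
        return self.seq, dict(self.entries)


def apply_events(snapshot_seq, view, events):
    """Bring a dumped view up to date, skipping events the dump includes."""
    for seq, op, key, value in events:
        if seq <= snapshot_seq:
            continue  # already reflected in the dump
        if op == "add":
            view[key] = value
        else:
            view.pop(key, None)
    return view
```

The point of the ordering is that a change landing between the subscribe and the dump is neither lost nor applied twice: it is in the event list, but its sequence number shows it is already part of the dump.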

There is also, of course, /proc/net/route, which can be useful in many circumstances, but for efficiency and synchronization reasons it is not the interface of choice for routing protocols. David's proposal has a number of the important attributes of an rtnetlink-like (in a conceptual sense) solution, and I remain skeptical that a sysfs or similar interface would be an efficient solution to the original problem, even if it might perhaps make a useful addition.

There is also the chicken-and-egg issue, in the sense that if the interface is via a filesystem (sysfs, proc or whatever), how does one receive a notification for that filesystem itself being mounted until after it has been mounted? Maybe that is not a particular problem, but I think a cleaner solution would not require a mount in order to watch for other mounts.

Steve.