Re: [RFC 2/2] fanotify: emit FAN_MODIFY_DIR on filesystem changes

From: Filip Štědronský
Date: Tue Mar 14 2017 - 10:58:16 EST


Hi,

On Tue, Mar 14, 2017 at 01:18:01PM +0200, Amir Goldstein wrote:
> I claim that fanotify filters event by mount not because it
> was a requirement, but because it was an implementation challenge
> to do otherwise.
>
> And I claim that what mount watchers are really interested in is
> "all the changes that happen in the file system in the area
> that is visible to me through this mount point".
>
> In other words, an indexer needs to know if files were modified\
> create/deleted if that indexer sits in container host namespace
> regardless if those files were modified from within a container
> namespace.
>
> It's not a matter of security/isolation. It's a matter of functionality.
> I agree that for some event (e.g. permission events) it is possible
> to argue both ways (i.e. that the namespace context should be used
> as a filter for events).
> But for the new proposed events (FS_MODIFY_DIR), I really don't
> see the point in isolation by mount/namespace.

there are basically two classes of uses for a fanotify-like
interface:

(1) Keeping an up-to-date representation of the file system.
For this, superblock watches are clearly what you want.

* You are interested in the current state of the
filesystem, so you need to know about every change,
regardless of where it came from.
* As I mentioned earlier, in case of remote, distributed
and virtual filesystems, the change might come from
within the filesystem itself (if the protocol supports
reporting such changes). This can probably be
implemented only with superblock-scoped watches because
the change is fundamentally not related to any mount.
* Some filesystems might also support change journalling
and it might be conceivable to extend the API in the
future to report "past" events (for example by passing
sequence number of last seen event or similar).
* The argument about containers escaping change notification
you mentioned earlier.

All those factors speak strongly in favour of superblock
watches.

(2) Tracking filesystem *activity*. Now you are not building
an image of current filesystem state but rather a log of
what happened. Perhaps you are also interested in who
(user/process/...) did what. Permission events also fit
mostly in this category.

For those it *might* make sense to have mount-scoped
watches, for example if you want to monitor only one
container or a subset of processes.

We both concentrate on the first class, but we shouldn't forget about
the second, which was one of the original motivations for
fanotify.

Thus I conclude that it might be desirable to implement
mount-scoped filename events in the long run, even though
I agree that the sb-scoped events are more important because
they cover more use cases and you can do additional filtering
(e.g. by pid) if deemed necessary.
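A minimal sketch of such pid filtering in userspace, assuming the
events arrive in the existing struct fanotify_event_metadata format
(which already carries a pid field). The helper name
count_events_from_pid() is made up for illustration.

```c
#include <stddef.h>
#include <sys/fanotify.h>
#include <sys/types.h>

/*
 * Hypothetical helper: count the events in `buf` that were caused by
 * process `pid`. `buf` holds `len` bytes as returned by read(2) on a
 * fanotify fd. A real consumer would also close meta->fd for every
 * event that carries one.
 */
static size_t count_events_from_pid(const void *buf, ssize_t len, pid_t pid)
{
	const struct fanotify_event_metadata *meta = buf;
	size_t matches = 0;

	/* FAN_EVENT_OK/FAN_EVENT_NEXT walk the variable-length records. */
	while (FAN_EVENT_OK(meta, len)) {
		if (meta->pid == pid)
			matches++;
		meta = FAN_EVENT_NEXT(meta, len);
	}
	return matches;
}
```

The same loop could match against a set of pids to approximate
"a subset of processes" on top of a wider watch.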

This would require one of:

(a) Sprinkling the callers of vfs_* with fanotify calls
as I did,
(b) Creating wrapper functions like vfs_path_unlink & co.
that would make the necessary fanotify call (and probably
tell the lower function not to generate another
notification), as I suggested earlier, or
(c) Giving the vfs_* functions an *optional* vfsmount argument.

In the end I probably find (c) the most elegant but this
can be discussed later, even after your changes are merged.
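To make the difference between (b) and (c) concrete, here is a toy
userspace model. It is not kernel code: all types and function names
(mount_model, sb_model, model_*) are invented for illustration, and
the "notifications" are just counters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a vfsmount and a superblock. */
struct mount_model { int events; };
struct sb_model { int events; };

static void notify_mount(struct mount_model *mnt) { mnt->events++; }
static void notify_sb(struct sb_model *sb) { sb->events++; }

/*
 * Option (b), lower half: the existing function gains a way to be
 * told not to notify, because the wrapper has taken over that job.
 */
static int model_unlink_lower(struct sb_model *sb, bool notify)
{
	/* ... the actual unlink would happen here ... */
	if (notify)
		notify_sb(sb);
	return 0;
}

/* Option (b), wrapper: knows the mount and notifies exactly once. */
static int model_path_unlink(struct sb_model *sb, struct mount_model *mnt)
{
	int ret = model_unlink_lower(sb, false);

	if (ret == 0) {
		notify_sb(sb);
		notify_mount(mnt);
	}
	return ret;
}

/*
 * Option (c): one function with an optional mount argument.
 * NULL means the caller has no mount context.
 */
static int model_vfs_unlink(struct sb_model *sb, struct mount_model *mnt)
{
	/* ... the actual unlink would happen here ... */
	notify_sb(sb);
	if (mnt)
		notify_mount(mnt);
	return 0;
}
```

In this model (c) keeps the notification logic in one place, while
(b) needs both a wrapper and a suppression flag per operation.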

Filip