Re: [RFC PATCH -v4 00/14] fsnotify, dnotify, and inotify

From: Al Viro
Date: Thu Dec 25 2008 - 20:44:19 EST


On Thu, Dec 25, 2008 at 07:58:23PM -0500, C. Scott Ananian wrote:
> > Indeed - explicit, persistent and wrong. For current directory of a process
> > we store vfsmount and dentry. And use those in getcwd() rather than playing
> > hopeless games with inodes.
>
> Geez. Please don't treat me as if I can't read source code.
>
> I suggested a Mach-like iopen mechanism to address some inotify races.
> In order to show that extreme VFS violence might not be necessary, I
> pointed out that *in some cases* you can derive *some paths* to the
> file from the inode number, using the iget()->i_dentry list. But
> you've driven me far off-topic.

BTW, XFS open-by-inode implementation is severely broken - try using
it for directories and you'll get all kinds of hell breaking loose.
And that's the only attempt at iopen in Linux.

While we are at it, iget() is not a generic interface - it's strictly
up to specific filesystem whether that sucker will work at all. Or
what kind of data will have to be required for it to work on given fs,
assuming it will work to start with.

> Let's get back to the problem at hand. The most obvious problem with
> inotify is the race between mkdir/IN_CREATE and the userspace process

[snip]

> As far as I can tell, none of the existing Linux desktop search tools
> attempt to deal with these races. (Beagle handles the 'mkdir' race,
> but not the other rename races.) This is acceptable only if an
> unreliable file index is acceptable.

... and inotify is unreliable by design, or what passes for it anyway.

> Some possible improvements to the situtation (all bad, in various ways
> -- better suggestions wanted!):
>
> a) do nothing. Most developers will ignore the races in inotify out
> of ignorance or complexity, and most applications which use inotify
> will be unreliable as a result.
>
> b) use inode numbers rather than path names uniformly, in both
> inotify and the userland search index, along with an iopen() syscall,
> as in Mach. This decouples path maintenance from indexing. This was
> discussed in (for example)
> http://www.coda.cs.cmu.edu/maillists/codalist/codalist-1998/0217.html
> by Peter J. Braam and Ted Ts'o, but Al Viro has been objecting to the
> idea here. (If all you need to do is open found files after a search,
> you can skip path maintenance entirely.)

For one thing, opened directory can be fchdir'ed to. Or passed to
openat() et.al. For another, go ahead, show me how to implement that
sucker for something like NFS. Or CIFS. Or FAT, while we are at it.
Or anything that does have stable numbers identifying fs objects, but where
those numbers are huge.

> c) Pass file descriptors in the notification API from the kernel.
> This solves the races associated with renames before indexing.
> Userland still has to maintain its own copy of all the direntries for
> indexed content, but at least this task is decoupled. (The proposed
> fanotify API passes file descriptors, but provides no mechanism (yet)
> for path maintenance.)
>
> d) Do all indexing in the filesystem. BeOS used this option; in
> Linux-land, this would probably be a thin FUSE shim which layered over
> an existing filesystem. The shim could grab the appropriate locks to
> manage the races and ensure that the index's path information was
> consistent with the filesystem.

e) start with providing a higher-level description of what you want to
achieve. While we are at it, is it supposed to be fs-agnostic? What
kinds of filesystems could in principle be used with that?

So far you've mentioned use of blatantly inadequate tool and far too
low-level description of changes that might possibly make it more
tolerable for unspecified use you have in mind. I'm sorry, but that's
exactly how a bunch of bad APIs (including inotify) got pushed into the
tree and I would rather not repeat the experience.

Interfaces must make sense. And "we need <list of things> for our project,
nevermind what are they for, here's what we would like to see" is a recipe
for kludges. Inotify, dnotify, F_SETLEASE, etc. Sigh...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/