> You will need it. After a reboot, clients may continue to use file handles
> referencing the old dentry for a long time: think of the file handle
> of the mount point itself, even though that's not a huge problem as
> it's only used in lookup calls which tend to make up 1% of NFS traffic.
> Worse, think of someone running long-lived applications off an NFS-mounted
> directory (mail readers, editors, CAD programs...). You really don't
> want to do a bottom-up path reconstruction on each page-in.
I've implemented a cache for saving the results when it has to do a
lookup, so this will prevent doing multiple lookups. I'm expiring these
patchups after 2 minutes, so a very long-lived filehandle may have to be
looked up more than once, but this will be a rare occurrence. Following
a reboot the knfsd should return to normal efficiency very quickly --
within a minure or two I would guess.
> If you use a reasonable hash algorithm, you may even be better off leaving
> the dentry out of the FH altogether and use just the ino/dev-to-dentry
> cache.
The dentry in the ino/dev cache still has to be validated, as I can't
hold it with a use count. As long as the dentry validation logic has to
be there, having the dentry in the fh will be a big win -- I suspect it
will validate OK in 80-90% of cases.
> There's one problem with the cache, though. Entries in the cache must have
> a _very_ short TTL on the order of 2 seconds. One of the most
> frequently asked questions about unfsd was `why can't I unmount my CDROM?',
> which was because unfsd was holding file descriptors to open fds on it...
Would it help to add a cache flush when removing an export? If knfsd
could get some notification that the user was about to umount a volume,
the dentry cache could be selectively flushed.
> Still, no-one has answered me why knfsd can't use the following approach:
>
> /* Look up super block for device */
> sb = get_super_block(fh->fh_dev);
>
> /* Look up inode by number */
> inode = __iget(sb, fh->fh_ino, 0)
>
> /* Create a fancy name for it... */
> sprintf(namebuf, "%x-%d", fh->fh_dev, fh->fh_ino);
> /* ... and create the qstr hash of it */
> /* (see namei.c) */
>
> /* Try to look it up under fake knfsd root */
> dentry = d_lookup(fakeroot, &qstr);
> if (dentry == NULL) {
> /* not found - create and attach it */
> dentry = d_alloc(fakeroot, namebuf);
> d_add(fakeroot, dentry);
> }
>
> This should produce a valid dentry (this is also what the procfs does
> for stuff like the fd/* and cwd entries).
This would probably work fine for reading and writing, but break in
subtle ways for operations that update the dcache. Most of the fs assume
that the dentries they're given for an operation represent the fs
objects in the same way the fs does. Adding these fake dentries would be
like adding hard links from outside the fs without the knowledge of the
fs.
Anyway, the new lookup code is working fine and I've released a patch
against 2.1.65. Once people start testing the new FH code we'll get an
idea of how it scales, but I really don't expect knfsd to spend very
much time validating/looking up filehandles. At some point I'll add
some statistics to track validation hits and misses, lookups needed,
etc.
Regards,
Bill