Re: NFS file handle cached incorrectly

From: Trond Myklebust
Date: Mon Apr 12 2004 - 16:03:14 EST


På m , 12/04/2004 klokka 13:38, skreiv David Mansfield:

> rm foo; date >foo; sleep 1; cat foo; \
> ssh -x othermachine 'touch foo.new; rm foo; mv foo.new foo'; cat foo; date
>
> Basically, what happens is that the inode (on server) of the file 'foo'
> changes 'out from under' the client (by another client, 'othermachine',
> that has mounted same directory). Client then uses the cached file handle
> and gets a 'stale nfs handle' error visible to user space for the last
> 'cat foo'.
>
> If more than one second passes between the rename on 'othermachine' and
> the 'cat foo' on the client, the problem doesn't appear.
>
> In reality, this is adversely affecting 'ssh' with X11 forwarding (funny
> things happening with the xauth program on either end of the ssh).
>
> Could we retry the LOOKUP if the fh comes from the cache and we get a
> stale file handle error automatically?

No! The file handle is assumed to be correct if the directory
revalidated without any errors. It is cached until it is needed. The
ESTALE occurs long after we've exited from LOOKUP.

> Any other ideas?

I am sorry: you are simply violating the NFS caching premises. This is
something that is not *ever* guaranteed to work whether or not you have
READDIRPLUS enabled.
The problem here is rather that you are making remote modifications to
the NFS server's directory within < 1second (which is the resolution on
"mtime" on Linux 2.4.x) of the previous modification. Linux (and all
other NFS clients that I'm aware of) uses the mtime in order to decide
whether or not a file/directory/... has been modified since the cache
was last updated (unless it is a modification that was made by this
client).

The only "solution" to your problem here is to upgrade the *server* to
Linux-2.6.x: the latter has 1 nanosecond resolution on the "mtime", and
so can register modifications that are far smaller than 1second.

Cheers,
Trond
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/