Re: [Call For Testers][VFS] Rename stuff.

Alexander Viro (viro@math.psu.edu)
Tue, 9 Feb 1999 21:44:00 -0500 (EST)


On Tue, 9 Feb 1999, Kenneth Preslan wrote:

> > coda: no rename/rename protection, no is_subdir. Mommy, server will do the
> > right thing, certainly it will not screw me up, why would it?
> ...
> > nfs: no checks, no nothing. Fixed.
>
> I think you're missing a big point here. Rename serialization MUST be done
> at the server. The semaphore you're taking out protects against two different
> processes doing renames at the same time. It doesn't protect against two
> different processes on *different machines* renaming at the same time.
> Their thinking was: "All the is_subdir() stuff already happens on the server,
> so why do it twice?"

Because doing it twice means that server going ballistic will not
leave client's dcache in inconsistent state. Be conservative in what you
send and all such. Indeed, checks must be done on server's side *too*.

> I work on GFS (http://gfs.lcse.umn.edu). Shifting the is_subdir() test into
> the VFS does nothing for me except make my file system slower.

There is a kinder serialization and I'll probably do it in the
final variant anyway - instead of holding a per-fs semaphore we use
wait queue and can do the following (instead of down/is_subdir stuff):

retry:
fence=new_dentry;
while( !IS_ROOT(fence) && fence!=old_dentry && !(fence->d_flags &
D_SOURCE))
fence = fence->d_parent;
if (fence == source)
return -EINVAL;
if (fence == root)
goto everything_clear;
for ( p=old_dentry; p!=fence && !IS_ROOT(p); p=p->d_parent)
;
if (p!=fence) {
sleep_on(&sb->s_vfs_rename_wait);
goto retry;
}
everything_clear:
old_dentry->d_flags |= D_SOURCE;
/* do rename */
old_dentry->d_flags &= ~D_SOURCE;
wake_up(&sb->s_vfs_rename_wait);

That's it. In other words, we find the closest ancestor of target that is
source of some rename in progress (catching is_subdir() cases in the same
loop) or is root. Then if it is the ancestor of source too we can go ahead
since this rename() is commutative with all renames already in progress.
Otherwise we'll have to wait for some of them to finish and recheck the
conditions later. That way we can guarantee that fs will never receive a
rename() request that is not independant from all requests already in
progress. This scheme is finer than 'no concurrent renames within the same
fs' one.

> The VFS layer grabs the rename semaphore, does the is_subdir() check, and
> calls my rename() i_op. Then I have to get the GFS inter-machine rename lock
> and do the is_subdir() check again. If I don't, I leave myself open to
> race conditions.

Certainly, for networking filesystem you have to do checks on
server's side *anyway*. Hell, not to mention races, clients may be
compromised, go mad, whatever.

> This patch your proposing doesn't hurt the functionality of any file system.
> In that respect it's ok. And the performance hit of walking the dcache to do
> the check isn't too bad, but I just wanted to point out what you're doing.
I know. The point being, you can leave to the server all checks
that do not affect the consistency of client's state, but leaving *all*
checks to the server is calling for trouble.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/