Re: I discussed reading directories as files with jra, Stallman,

Alexander Viro (viro@math.psu.edu)
Sun, 20 Jun 1999 20:06:55 -0400 (EDT)


On Sun, 20 Jun 1999, Linus Torvalds wrote:

> > Linus, I see your point here, but IMO it means only one thing - that we
> > should stop pretending that those objects are symlinks. They are
> > different. Yes, we have one more type of object. Call it VFS-link,
> > wormhole, whatever.
>
> They are NOT symlinks. They never were. It so happens that the VFS
> _method_ is called "follow_link", but yes, you might as well call it
> "wormhole".
>
> It so happens that for a traditional "unix" symlink, the "wormhole()"
> method implies reading the link and looking it up.

Good. The only question being: why are we pushing the traditional symlinks
under the same roof? That's what makes me very uncomfortable. And it makes
lookup_dentry() pretty uncomfortable too - we got *way* too nasty
recursion there. I'm not suggesting 4.4BSD method - they just push the
contents of symlink (I'll use this word *only* for traditional ones, OK?)
into the buffer that contains the rest of name. It's ugly as hell and we
can do much better. Look - for symlinks we are doing the following:
        ->follow_link:
                allocate some resource (kmalloced buffer, buffer-head, page, etc.)
                and read the contents of symlink there.
                call lookup_dentry
                release the resource.
        ->readlink:
                allocate....
                copy to user buffer.
                release the resource.
(for *normal* symlinks, that is). Absolutely trivial modification of that
being:
        add a structure with pointer to name, destructor and argument for
destructor.
        add a new method that gets a dentry of symlink + pointer to
empty structure of that kind and fills said structure.
        make 3 places in the kernel aware of new method -
do_follow_link(), sys_readlink() and nfsd_readlink().
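
A minimal user-space sketch of the "structure + new method" idea above, not
the actual patch. The names struct link_name, struct toy_dentry,
toy_k_readlink() and toy_readlink() are hypothetical stand-ins; malloc()/free()
stand in for whatever resource a real filesystem would allocate.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor: the name plus how to release what backs it. */
struct link_name {
    const char *name;            /* link contents, NUL-terminated */
    void (*destructor)(void *);  /* releases the backing resource, may be NULL */
    void *arg;                   /* argument passed to the destructor */
};

/* Toy stand-in for a symlink dentry: it only holds the stored target. */
struct toy_dentry {
    const char *stored_target;
};

/* Hypothetical new method: fill an empty descriptor for symlink 'd'. */
static int toy_k_readlink(const struct toy_dentry *d, struct link_name *out)
{
    assert(d != NULL && out != NULL);
    size_t len = strlen(d->stored_target) + 1;
    char *buf = malloc(len);     /* stands in for kmalloc/buffer-head/page */
    if (buf == NULL)
        return -1;
    memcpy(buf, d->stored_target, len);
    out->name = buf;
    out->destructor = free;
    out->arg = buf;
    return 0;
}

/* Generic release: the caller never needs to know what backs the name. */
static void link_name_release(struct link_name *ln)
{
    if (ln->destructor != NULL)
        ln->destructor(ln->arg);
    ln->name = NULL;
    ln->destructor = NULL;
    ln->arg = NULL;
}

/* Generic readlink-style caller: copies at most 'size' bytes, no NUL added. */
static long toy_readlink(const struct toy_dentry *d, char *dst, size_t size)
{
    struct link_name ln;
    if (toy_k_readlink(d, &ln) != 0)
        return -1;
    size_t n = strlen(ln.name);
    if (n > size)
        n = size;
    memcpy(dst, ln.name, n);
    link_name_release(&ln);
    return (long)n;
}
```

The point of the sketch is that the allocate/use/release sequence lives in one
generic caller, and the filesystem only supplies the fill method.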

Slightly more clever scheme being:
        add a new field to struct inode - pointer to such object (NULL
upon read_inode()).
        add a new function: vfs_readlink(). Said function checks if inode
has an associated structure (in which case it simply returns it) or
allocates a new structure, uses ->k_readlink() to fill it and attaches it
to inode.
        let the aforementioned 3 functions call vfs_readlink() if they see
that ->k_readlink() is there.
        let either clear_inode() or the functions in question detach the
structure from inode and kill it (I made it a policy settable on per-inode
basis).
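
A rough user-space sketch of the caching scheme above, under stated
assumptions. struct toy_inode, its i_link and volatile_link fields, and the
toy_* functions are all hypothetical; the single volatile_link flag only
approximates the per-inode policy, and the reads counter exists purely to
show when the contents are actually fetched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cached descriptor, reduced to just the name. */
struct cached_link {
    char *name;
};

/* Toy inode: only the fields this sketch needs. */
struct toy_inode {
    const char *on_disk_target;  /* what the fill method would read */
    struct cached_link *i_link;  /* assumed new field, NULL after read_inode */
    int volatile_link;           /* assumed policy flag: drop after each use */
    int reads;                   /* how many times the contents were fetched */
};

/* Stand-in for ->k_readlink(): allocate and fill a fresh descriptor. */
static struct cached_link *toy_k_readlink(struct toy_inode *inode)
{
    size_t len = strlen(inode->on_disk_target) + 1;
    struct cached_link *cl = malloc(sizeof *cl);
    if (cl == NULL)
        return NULL;
    cl->name = malloc(len);
    if (cl->name == NULL) {
        free(cl);
        return NULL;
    }
    memcpy(cl->name, inode->on_disk_target, len);
    inode->reads++;
    return cl;
}

/* Return the attached descriptor, filling and attaching it on first use. */
static struct cached_link *toy_vfs_readlink(struct toy_inode *inode)
{
    assert(inode != NULL);
    if (inode->i_link == NULL)
        inode->i_link = toy_k_readlink(inode);
    return inode->i_link;
}

/* Detach and destroy the descriptor, as clear_inode() would. */
static void toy_clear_inode(struct toy_inode *inode)
{
    if (inode->i_link != NULL) {
        free(inode->i_link->name);
        free(inode->i_link);
        inode->i_link = NULL;
    }
}

/* Called by a user when done: drops the cache only for volatile links. */
static void toy_put_link(struct toy_inode *inode)
{
    if (inode->volatile_link)
        toy_clear_inode(inode);
}
```

With volatile_link clear, a second toy_vfs_readlink() returns the same pointer
without touching the "disk"; with it set, every use re-reads the contents.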

What it gives:
        symlink-related code duplication in normal filesystems is gone.
        recursion is *much* milder - instead of lookup_dentry ==>
do_follow_link ==> foofs_follow_link ==> lookup_dentry we have
lookup_dentry ==> do_follow_link ==> lookup_dentry (all local variables of
foofs_follow_link are out). In other words, we can push the limit on
number of nested symlinks (s/5/35/ as limit for current->link_count,
increment by 1 for normal symlinks and by 7 for special ones).
        normal symlinks are cached (if you don't want to cache it - fine,
just set S_VOLATILE_LINK in inode->flags; if you want to keep the sucker
until the inode destruction - set S_LCACHE_LINK).
        for special symlinks/wormholes/whatever you need *zero* changes -
old methods are still working. If you want to use them for normal symlinks
- your business. Old source doesn't break.
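
To illustrate the s/5/35/ arithmetic: with a budget of 35, a cost of 1 per
normal symlink and 7 per special one, the nesting limit becomes 35 for normal
symlinks while staying at 5 for special ones. A small sketch follows; the
function names and the exact comparison are assumptions, only the numbers
come from the text above.

```c
#include <assert.h>

enum {
    LINK_BUDGET  = 35,  /* proposed limit for current->link_count */
    COST_NORMAL  = 1,   /* charge for a traditional symlink */
    COST_SPECIAL = 7    /* charge for a special link / wormhole */
};

/*
 * Try to enter one more level of link following.
 * Returns 0 and charges the counter, or -1 (think -ELOOP) if the
 * budget would be exceeded.
 */
static int charge_link(int *link_count, int is_special)
{
    int cost = is_special ? COST_SPECIAL : COST_NORMAL;
    if (*link_count + cost > LINK_BUDGET)
        return -1;
    *link_count += cost;
    return 0;
}

/* Undo a successful charge when the nested lookup returns. */
static void uncharge_link(int *link_count, int is_special)
{
    int cost = is_special ? COST_SPECIAL : COST_NORMAL;
    assert(*link_count >= cost);
    *link_count -= cost;
}
```

Mixed chains share the same budget: four special links (28) leave room for
seven more normal ones before the lookup is refused.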

And the next step being iterative lookup_dentry(). The only remaining
recursion will sit in follow_link() for wormholes. Which are unlikely to
mess with full-blown lookup_dentry() anyway.

I have everything except the last step implemented and more or less
tested. Notice that it has nothing to do with symlink caching on page level -
it's an independent mechanism. They coexist just fine (for NFS).

If you want to have independent follow_link() and readlink() - just keep
the old inode_operations for the object in question. No need to change
anything - old code is still working. All changes are for traditional
symlinks - they got a new method.

I'm going to repost the patch as soon as 2.3.7-pre* settles down. It
works here...
