Re: PATCH: smart symlink loop detection.

George Famelis (gf@eetaa.gr)
Thu, 16 Apr 1998 12:56:01 +0300


Hi

: C. Scot Ananian
> > Well, here's the problem, as pointed out to me this afternoon by Mr. Linus
> > Torvalds: symlink traversal is inherently recursive. We can rewrite the
> > tail recursion as iteration (which my morning patch did), but symlinks in
> > the middle of paths require mid-recursion. You need to keep a stack
> > around *somewhere* to handle this.
>
: Adam D. Bradley
> Yeah, I read Linus's message, and have been wrestling with various
> pseudo-code algorithms for a little while, trying to show how this
> thing can be done iteratively. A basic model (similar to the one
> givcen by Linus):
>

Let me try to show that there is no middle recursion and that we allways
we have tail recursion.
Here are some definitions:

<FI> := <real-file-ident> || <symlink-file-ident> # a File Identifier
<DI> := <real-dir-ident> || <symlink-dir-ident> # a Directory
Identifier
<UF> := <UP> || <UP><FI> # an Unresolved File Name (i.e some components
may be symlinks)
<UP> := <empty> || <RP> || <UP><DI>"/" # an Unresolved Path
<RP> := "/" || <RP><real-dir-name>"/" # a Real Path (i.e. it has no
components that are symlinks)
<RF> := <RP> || <RP><real-file-name> # a Real File Name (i.e. the one
with no symlink components)
<symlink-any-ident> := <symlink-file-ident> || <symlink-dir-ident>

The contents of <symlink-dir-ident> is a <UP> and
the contents of <symlink-file-ident> is a <UF>.

Our problem is to find the <RF> for a given <UF>.

I assume that get_cwd() returns the <RP> of Current Working Directory

For the loop detection we need something unique and I suggest that per
file system the inode of the
<ident> at hand is unique. If this is true (I don't know if this holds
for all types of file systems) we can have the pair {FileSystem,
inode-number} to be the thinks we remember for all the
inodes we have visited.

For any given <UF> there are two cases:

Case 1) <UF> is a relative name (i.e. not starting with a "/")
=> get_cwd() + <UF> is in Case 2 format

Case 2) <UF> is an absolute name
=> we can always starting from the begining to split it in one of the
forms
Case 2.a) <RP><real-file-ident> or
Case 2.b) <RP><symlink-file-ident> or
Case 2.c) <RP><symlink-dir-ident><UF>

In case 2.a we are done.

In cases 2.b,2.c we call the loop detecting algorithm with the triple
{FileSystem, inode-number_of(<symlink-any-ident>}

if a loop is found then we are done

else if <symlink-any-ident> contains an absolute name then the new <UF>
to check is
(Case 2.b.1) => new<UF> := <contents-of-absolute-symlink-file-ident> =>
(Case 2)
(Case 2.c.1) => new<UF> := <contents-of-absolute-symlink-dir-ident><UF>
=> (Case 2)

else <symlink-any-ident> contains a relative name and the new <UF> is
(Case 2.b.2) => new<UF> := <RP><contents-of-relative-symlink-file-ident>
=> (Case 2)
(Case 2.c.2) => new<UF> :=
<RP><contents-of-relative-symlink-dir-ident><UF> => (Case 2)

In any case we continue either from the begining of the new<UF> (Cases
2.[bc].1)
or from the end of <RP> of the new<UF> (Cases 2.[bc].2)
So in any case we have an iteration.

Let me modify what Adam D. Bradley wrote

+ char *PathStart;
+ char *SaveStr(char *, ...);
+ void FreeStr(char *); /* Free what SaveStr has allocated */
+ char *GetNameOf(dentry *);
+ char *SymlinkValue(dentry *); /* where the symlink points to */
+ void LoopDetectInit();
+ void LoopTrail(dentry *); /* remember the places where we have
been */
+ bool LoopDetect(dentry *);

> dentry *lookup_dentry(char *name)
> {
> dentry *here, *newhere;
+ char *IdentAtHand;
+ char *Symlink;
>
+ /* Do not FreeStr(name) here is not ours */
> if (*name=='/') {
+ PathStart = SaveStr(name);
+ name = PathStart;
+ name++;
> - here = get_root();
+ newhere = get_root();
> - here++;
> } else {
> newhere = get_cwd();
+ ItemAtHand = GetNameOf(newhere);
+ PathStart = SaveStr(ItemAtHand, name)
+ name = &PathStart[strlen(ItemAthand)];
+ FreeStr (ItemAtHand);
> }
> /* "here" will always point to either a directory,
> a symlink, or a file. */
>
+ LoopDetectInit();
+ LoopTrail(newhere);
> for(;;) {
+ here = newhere;
+ ItemAthand = name;
> subname = get_first_part_of(name);
> if (!subname)
> break; /* we've exhausted the name passed to us */
> name += strlen(subname);
> for ( ; *name=='/' ; name++); /* lob off spare /'s */
>
> - if (!IS_DIR(newhere->inode))
+ if (!IS_DIR(here->inode))
> - return -EEXIST;
+ goto Failed;
>
> newhere = lookup_within_directory(here, subname);
>
+ LoopTrail(newhere);
+ if (!IS_SYMLINK(newhere->inode)
+ continue;
+
+ /* now we have a symlink */
+ if (LoopDetect(newhere))
+ goto Failed;
+
+ Symlink = SymlinkValue(newhere->inode);
+ if (*Symlink == "/") { /* an absolute path */
+ name = SaveStr(Symlink, "/", name);
+ FreeStr(SymLink);
+ FreeStr(name);
+ newhere = get_root();
+ goto Restart;
+ } else { /* a relative path */
+ *ItemAtHand = '\0';
+ ItemAtHand = PathStart;
+ PathStart = SaveStr(PathStart, SymLink, "/",
name);
+ FreeStr (SymLink);
+ FreeStr (name);
+ name = &PathStart[strlen(ItemAtHand)];
+ FreeStr (ItemAtHand);
+ newhere = here; /* we continue to be at the
same place */
+ continue;
+ }

> - while (IS_SYMLINK(newhere.inode)) {
> - newhere = traverse_one_step(newhere);
> - if (!newhere)
> - return -EEXIST;
> - }
> -
> - here = newhere;
> - /* Now, "here" is either a dir or a file. If a file,
> - then we detect /path/file/bogus-path above.
> - If a directory, we're set... either the given
> - path is exhausted and the loop breaks, or
> - there's more to look up using l_w_d() */
> }
>
+ FreeStr(ItemAtHand);
+ FreeStr(name);
+ /* PathStart contains the Real Path of argument given */
> return here;
+ Failed:
+ FreeStr(ItemAtHand);
+ FreeStr(PathStart);
+ FreeStr(name);
+ PathStart = NULL;
+ return (dentry *)NULL;
> }
>

Regards,

George

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu