Re: [patch 2/6] fs: no games with DCACHE_UNHASHED

From: Nick Piggin
Date: Thu Oct 15 2009 - 04:15:22 EST


On Thu, Oct 15, 2009 at 09:44:35AM +0200, Eric Dumazet wrote:
> npiggin@xxxxxxx wrote:
> > (this is in -mm)
> >
> > Filesystems outside the regular namespace do not have to clear DCACHE_UNHASHED
> > in order to have a working /proc/$pid/fd/XXX. Nothing in proc prevents the
> > fd link from being used if its dentry is not in the hash.
> >
> > Also, it does not get put into the dcache hash if DCACHE_UNHASHED is clear;
> > that depends on the filesystem calling d_add or d_rehash.
> >
> > So delete the misleading comments and needless code.
> >
>
> This was added in commit 304e61e6fbadec586dfe002b535f169a04248e49
>
> [PATCH] net: don't insert socket dentries into dentry_hashtable
>
> We currently insert socket dentries into the global dentry hashtable. This
> is suboptimal because there is currently no way these entries can be used
> for a lookup(). (/proc/xxx/fd/xxx uses a different mechanism). Inserting
> them in dentry hashtable slows dcache lookups.
>
> To let __dpath() still work correctly (ie not adding a " (deleted)") after
> dentry name, we do :
>
> - Right after d_alloc(), pretend they are hashed by clearing the
> DCACHE_UNHASHED bit.
>
> - Call d_instantiate() instead of d_add() : dentry is not inserted in
> hash table.
>
> __dpath() & friends work as intended during dentry lifetime.
>
> - At dismantle time, once dput() must clear the dentry, setting again
> DCACHE_UNHASHED bit inside the custom d_delete() function provided by
> socket code, so that dput() can just kill_it.
>
> Signed-off-by: Eric Dumazet <dada1@xxxxxxxxxxxxx>
> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> Acked-by: "David S. Miller" <davem@xxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxx>
>
>
>
> Back in 2006, we had to perform this hack so that __d_path() would not
> append " (deleted)" to these names:
>
>     if (!IS_ROOT(dentry) && d_unhashed(dentry) &&
>         (prepend(&end, &buflen, " (deleted)", 10) != 0))
>             goto Elong;
>
> In the current kernel this part became:
>
>     if (d_unlinked(dentry) &&
>         (prepend(&end, &buflen, " (deleted)", 10) != 0))
>             goto Elong;
>
>
> So your cleanup seems good, thanks!
>
> Acked-by: Eric Dumazet <dada1@xxxxxxxxxxxxx>
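
For readers following along, the three steps in the quoted commit message
can be modelled in userspace roughly as below. This is only a sketch under
stated assumptions: struct toy_dentry and the toy_*() helpers are
hypothetical stand-ins invented for illustration, not kernel code, and the
flag value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: the bit value is arbitrary here. */
#define DCACHE_UNHASHED 0x0010u

/* Hypothetical stand-in for the few pieces of dentry state discussed. */
struct toy_dentry {
	unsigned int d_flags;
	bool in_hash;   /* would be true only after d_add()/d_rehash() */
	bool has_inode;
};

/* d_alloc() analogue: a fresh dentry starts out flagged as unhashed. */
static void toy_alloc(struct toy_dentry *d)
{
	d->d_flags = DCACHE_UNHASHED;
	d->in_hash = false;
	d->has_inode = false;
}

/* Step 1: pretend to be hashed by clearing the flag. */
static void toy_pretend_hashed(struct toy_dentry *d)
{
	d->d_flags &= ~DCACHE_UNHASHED;
}

/* Step 2: d_instantiate() analogue, attaches the inode, no hash insert. */
static void toy_instantiate(struct toy_dentry *d)
{
	d->has_inode = true;
}

/* Step 3: custom d_delete() analogue, sets the flag again at teardown. */
static void toy_delete(struct toy_dentry *d)
{
	d->d_flags |= DCACHE_UNHASHED;
}

static bool toy_unhashed(const struct toy_dentry *d)
{
	return (d->d_flags & DCACHE_UNHASHED) != 0;
}
```

The point of the model is that the flag and the actual hash membership
(in_hash) are independent: the flag is cleared during the dentry's
lifetime while in_hash stays false throughout.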

Ahh, hmm, d_unlinked() is exactly the same check. OK, so I think this shows
why I'm an idiot: I didn't think about the __d_path() output. In my defence,
though, the comments are still misleading (and my test programs did not
use any anon_inode fds)...
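
To spell out why the two conditions quoted above are the same check, here
is a small userspace sketch. It assumes d_unlinked() is "unhashed and not
root", IS_ROOT() is "dentry is its own parent", and d_unhashed() tests the
DCACHE_UNHASHED bit. struct dentry here is a toy with only those two
fields, and the flag value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: the bit value is arbitrary here. */
#define DCACHE_UNHASHED 0x0010u

/* Toy stand-in for struct dentry with just the fields the checks read. */
struct dentry {
	unsigned int d_flags;
	struct dentry *d_parent;
};

static bool d_unhashed(const struct dentry *d)
{
	return (d->d_flags & DCACHE_UNHASHED) != 0;
}

/* A root dentry is its own parent. */
static bool is_root(const struct dentry *d)
{
	return d == d->d_parent;
}

/* The 2006-era condition quoted from __d_path(). */
static bool old_deleted_check(const struct dentry *d)
{
	return !is_root(d) && d_unhashed(d);
}

/* The current condition: unhashed and not a root dentry. */
static bool d_unlinked(const struct dentry *d)
{
	return d_unhashed(d) && !is_root(d);
}

/* Compare both checks over all four (root, unhashed) combinations. */
static void check_equivalence(void)
{
	struct dentry root = { 0, NULL };
	struct dentry child = { 0, &root };
	struct dentry *cases[2] = { &root, &child };

	root.d_parent = &root;
	for (int i = 0; i < 2; i++) {
		for (unsigned int f = 0; f < 2; f++) {
			cases[i]->d_flags = f ? DCACHE_UNHASHED : 0;
			assert(old_deleted_check(cases[i]) == d_unlinked(cases[i]));
		}
	}
}
```

So a non-root dentry that leaves DCACHE_UNHASHED set, and has no d_dname
of its own, would still get " (deleted)" appended by either version.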

Now both sockets and pipes define a d_dname, so they are OK, but anon_inodes
does not. I think it should probably be made to provide a d_dname anyway,
so that we get the familiar "pseudofs:[ino]" format rather than the
"[pseudofs]" that we have now.

That should make this patch work for anon_inodes.c as well.
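
As a rough illustration of the name format meant here, a userspace sketch
follows. pseudo_dname() is a hypothetical helper made up for this example
and is not the kernel's d_dname interface; it only shows the
"pseudofs:[ino]" formatting with a bounds check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "<fsname>:[<ino>]" into buf.
 * Returns buf on success, or NULL if the name does not fit.
 * The name is built in a local buffer first, so a too-small
 * destination is never partially written.
 */
static char *pseudo_dname(char *buf, size_t buflen,
			  const char *fsname, unsigned long ino)
{
	char temp[64];
	int n;

	assert(buf != NULL && fsname != NULL);

	n = snprintf(temp, sizeof(temp), "%s:[%lu]", fsname, ino);
	if (n < 0 || (size_t)n >= sizeof(temp) || (size_t)n >= buflen)
		return NULL;

	memcpy(buf, temp, (size_t)n + 1); /* include the terminating NUL */
	return buf;
}
```

Calling pseudo_dname(buf, sizeof(buf), "pseudofs", ino) with a large enough
buffer yields a name of the same shape as the socket and pipe ones.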
