Re: [patch] epoll use a single inode ...

From: Linus Torvalds
Date: Wed Mar 07 2007 - 12:23:56 EST




On Wed, 7 Mar 2007, Eric Dumazet wrote:
>
> Crazy ideas : (some readers are going to kill me)

First off, as noted earlier, you don't need crazy ideas.

But:

> 1) Use the low order bit of f_path.dentry to say : this pointer is not a
> pointer to a dentry but the inode pointer (with the low order bit set to 1)

No, we don't want to do that. There's a lot of code that wants to just
follow the dentry/inode chain unconditionally, and we want to have people
able to just do

i_size_read(file->f_path.dentry->d_inode);

kind of thing, without having to check anything in between. It's costly
enough that we have a chain of pointers, it's much *worse* if we then have
to make conditionals etc (and forgetting is an oops).

> 2) file->f_path.dentry set to NULL for this special files (so that we dont
> need to dput() and cache line ping pong the common dentry each time we
> __fput() a pipe/socket.

I agree abot the ping pong, but I suspect it's a lot less than the current
pingpong on dcache_lock, which is a lot hotter than a pipe-dentry counter
would be.

If it becomes a big issue, we could just have a fixed number of dentries,
and hash them out some way (to dilute the ping-ponging). So making things
NULL would just be horrible for everybody, since we have lots of generic
code that just looks up the dentry and inode, and we don't want to make
that worse.

(I'm sure using "filp->f_private_data" has some downsides too, but it
seems fairly simple at least for pipes. The biggest problem is likely that
we currently use the mutex in the inode in "filp->f_dentry->d_inode" as a
way to protect the inode->i_pipe pointer itself, and there is no
equivalent locking at the "struct file *" level at all. We might have to
make the "f_ep_lock" spinlock be unconditional)

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/