Re: [PATCH 18/23] vfs: Teach epoll to use file_hotplug_lock

From: Davide Libenzi
Date: Tue Jun 02 2009 - 17:59:01 EST


On Tue, 2 Jun 2009, Eric W. Biederman wrote:

> Davide Libenzi <davidel@xxxxxxxxxxxxxxx> writes:
>
> > On Mon, 1 Jun 2009, Eric W. Biederman wrote:
> >
> >> From: Eric W. Biederman <ebiederm@xxxxxxxxxxxxxxxxxxxxxxxxxx>
> >>
> >> Signed-off-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxxxxxxxx>
> >> ---
> >> fs/eventpoll.c | 39 ++++++++++++++++++++++++++++++++-------
> >> 1 files changed, 32 insertions(+), 7 deletions(-)
> >
> > This patchset gives me the willies for the amount of changes and possible
> > impact on many subsystems.
>
> It both is and is not that bad. It is the cost of adding a lock.

We both know that it is not only the cost of a lock, but also the
sprinkling of another layer of code over a pretty vast number of
subsystems.



> I thought of doing something more uniform to user space. But I observed
> that the existing epoll punts on the case of a file descriptor being
> closed, and the locking needed to go from a file to the other epoll data
> structures is pretty horrid, so I said forget it and used the existing
> close behaviour.

Well, you cannot rely on the caller to tidy up the epoll fd by issuing an
epoll_ctl(EPOLL_CTL_DEL), so you do *need* to "punt" on close in order not
to leave lingering crap around. You cannot hold a reference to the file
either, since the epoll hook would then have to trigger not only at
->release() time, but at every close, where you would have to figure out
whether this is the last real userspace reference or not. Plus all the
issues related to holding permanent extra references to userspace files.
And since a file can be added to many epoll devices, you need to
unregister it from all of them (hence the lookup through the other data
structures).
Better to do this on the slow path, with locks acquired only in the epoll
usage case, than to do something else on the fast path, for every file.
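
For illustration only, here is a minimal userspace sketch of the close
behaviour described above, assuming a Linux system with epoll. The helper
names ready_after_close() and demo() are hypothetical and are not part of
the patch. It registers the read end of a pipe, closes that descriptor
without any epoll_ctl(EPOLL_CTL_DEL), and checks whether epoll still
reports the file. With a dup() outstanding the file has not been released
yet, so the registration survives; without one, the close is the last
reference and the entry goes away.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): returns the number of events
 * epoll_wait() reports after the registered descriptor has been closed,
 * or -1 if the setup fails. If keep_dup is nonzero, a dup() of the
 * descriptor keeps the underlying file open across the close().
 */
static int ready_after_close(int keep_dup)
{
    int pipefd[2];
    int dupfd = -1;
    int n = -1;
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event out;

    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;
    if (pipe(pipefd) != 0) {
        close(epfd);
        return -1;
    }

    /* Register the read end; no EPOLL_CTL_DEL is ever issued for it. */
    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) != 0)
        goto out;

    /* Make the pipe readable while the read end is still open. */
    if (write(pipefd[1], "x", 1) != 1)
        goto out;

    if (keep_dup) {
        dupfd = dup(pipefd[0]);
        if (dupfd < 0)
            goto out;
    }

    /* Close the registered descriptor without telling epoll. */
    close(pipefd[0]);
    pipefd[0] = -1;

    /* Zero timeout: only report what is ready right now. */
    n = epoll_wait(epfd, &out, 1, 0);

out:
    if (pipefd[0] >= 0)
        close(pipefd[0]);
    if (dupfd >= 0)
        close(dupfd);
    close(pipefd[1]);
    close(epfd);
    return n;
}

/* Usage example: the two cases discussed above. */
static void demo(void)
{
    assert(ready_after_close(1) == 1); /* file still open via the dup */
    assert(ready_after_close(0) == 0); /* last close dropped the entry */
}
```

The dup() case is why the cleanup cannot simply hang off every close():
the entry has to stay until the last reference to the file goes away.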



- Davide

