Re: [patch 14/22] pollfs: pollable futex

From: Davide Libenzi
Date: Thu May 03 2007 - 14:25:18 EST



I thought you were talking about the poll/epoll interface in general, and
about how to extend it for the very few cases that people ask for.
But I see we're focusing on futexes ...


On Thu, 3 May 2007, Ulrich Drepper wrote:

> On 5/2/07, Davide Libenzi <davidel@xxxxxxxxxxxxxxx> wrote:
> > 99% of the fds you'll find inside an event loop you care to scale about,
> > are *already* fd based.
>
> You are missing the point. To get acceptable behavior of the wakeup
> it is necessary with this approach to open one descriptor _per thread_
> for a futex. Otherwise all threads get woken upon FUTEX_WAKE.
>
> This also means you need individual epoll sets for each thread. You
> cannot share them anymore among all the threads in the process.

I'm not sure if futexes are the best approach to do that, but a way for
the user to signal an event into a main event loop is needed.



> > On top of that, those fds are very cheap in terms of memory
>
> They might be when they are counted in dozens. But here we are
> talking about the possible need to use thousands of additional file
> descriptors. If they are so cheap to allow thousands of descriptors
> with ease, why would the rlimit for files default to a small number
> (1024 on Fedora right now)?

Right now, people do that using pipes. That costs 2 file descriptors and at
least 4KB of kernel data (plus an inode, a dentry and a file). This just
to have a way to signal to an event loop dispatcher. The patches I posted
a few weeks ago introduce an eventfd, which reduces the amount of kernel
memory to basically a dentry and a file (plus it uses only one file
descriptor), and it's 2-3 times faster than pipes. The added cost is about
200 lines of code in fs/eventfd.c.
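
To make the signaling pattern concrete, here is a minimal sketch of one
descriptor waking an epoll-based loop. It assumes the eventfd(2) interface
as it is exposed by present-day Linux and glibc (eventfd(initval, flags),
EFD_NONBLOCK), which may differ from the posted patch. The function names
notify_loop, wait_and_drain and demo_signal_loop are made up for
illustration only.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Post n events: an 8-byte write adds n to the eventfd counter. */
int notify_loop(int efd, uint64_t n)
{
    ssize_t r = write(efd, &n, sizeof(n));
    return r == (ssize_t)sizeof(n) ? 0 : -1;
}

/* Wait up to timeout_ms for a registered eventfd to become readable,
 * then read it, which returns the accumulated counter and resets it.
 * Returns the counter value, 0 on timeout, or -1 on error. */
int64_t wait_and_drain(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    uint64_t count = 0;
    int ready = epoll_wait(epfd, &ev, 1, timeout_ms);

    if (ready <= 0)
        return ready;
    if (read(ev.data.fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}

/* Hypothetical demo: register one eventfd, signal it twice, wake once. */
int64_t demo_signal_loop(void)
{
    int efd = eventfd(0, EFD_NONBLOCK);   /* one fd, no pipe buffer */
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };
    int64_t got = -1;

    if (efd >= 0 && epfd >= 0 &&
        epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) == 0 &&
        notify_loop(efd, 1) == 0 &&
        notify_loop(efd, 2) == 0)
        got = wait_and_drain(epfd, 1000);

    if (epfd >= 0)
        close(epfd);
    if (efd >= 0)
        close(efd);
    return got;
}
```

Calling demo_signal_loop() from a main() shows that the two writes are
coalesced into one wakeup carrying their sum, so the dispatcher pays for
one read no matter how many times it was signaled. A pipe would need two
descriptors and a buffer for the same job.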



> > And this approach is not bound to a completely new and monolithic interface.
>
> So? It's still additional, new code for an approach which will have to
> be superseded real soon. That's just pure overhead to me.

IMO it is better to leave futexes alone. They are great for synchronizing
MT apps, but do not properly fit an fd-based solution. For that, something
like eventfd is enough.



- Davide

