Re: [PATCH v2 2/2] epoll: introduce EPOLLEXCLUSIVE and EPOLLROUNDROBIN

From: Eric Wong
Date: Wed Feb 18 2015 - 17:18:14 EST


Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> * Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> > > [...] However, I think the userspace API change is less
> > > clear since epoll_wait() doesn't currently have an
> > > 'input' events argument as epoll_ctl() does.
> >
> > ... but the change would be a bit clearer and somewhat
> > more flexible: LIFO or FIFO queueing, right?
> >
> > But having the queueing model as part of the epoll
> > context is a legitimate approach as well.
>
> Btw., there's another optimization that the networking code
> already does when processing incoming packets: waking up a
> thread on the local CPU, where the wakeup is running.
>
> Doing the same on epoll would have real scalability
> advantages where incoming events are IRQ driven and are
> distributed amongst multiple CPUs.

Right. One thing in the back of my mind has been to have CPU
affinity for epoll. Either having everything in an epoll set
favor a certain CPU or even having affinity down to the epitem
level (so concurrent epoll_wait callers end up favoring the
same epitems).

I'm not convinced this series is worth doing without a
comparison against my previous suggestion to use a dedicated
thread which only makes blocking accept4 + EPOLL_CTL_ADD calls.
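
For reference, a rough userspace sketch of that dedicated-thread
approach might look like the following. This is only an illustration
under my own assumptions, not code from this series: the names
struct acceptor, listen_loopback() and acceptor_loop() are
hypothetical, and EPOLLIN | EPOLLONESHOT is just one plausible choice
of event mask. The listen socket is left blocking and is never added
to the epoll set, so the worker threads only ever see client sockets.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical context handed to the acceptor thread. */
struct acceptor {
	int listen_fd;	/* blocking listen socket, never added to epoll */
	int epfd;	/* epoll set shared by the epoll_wait() workers */
};

/*
 * Helper for the sketch: blocking TCP listener on 127.0.0.1 with an
 * ephemeral port.  The bound address is written back to *addr.
 */
static int listen_loopback(struct sockaddr_in *addr)
{
	socklen_t len = sizeof(*addr);
	int fd = socket(AF_INET, SOCK_STREAM | SOCK_CLOEXEC, 0);

	if (fd < 0)
		return -1;
	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr->sin_port = 0;	/* let the kernel pick a port */
	if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
	    listen(fd, 128) < 0 ||
	    getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Thread body: sleep in accept4() and hand every new client to the
 * epoll set.  Returns when accept4() fails with an error other than
 * EINTR or ECONNABORTED.
 */
static void *acceptor_loop(void *arg)
{
	struct acceptor *a = arg;

	for (;;) {
		struct epoll_event ev;
		int fd = accept4(a->listen_fd, NULL, NULL,
				 SOCK_NONBLOCK | SOCK_CLOEXEC);

		if (fd < 0) {
			if (errno == EINTR || errno == ECONNABORTED)
				continue;
			break;
		}
		memset(&ev, 0, sizeof(ev));
		ev.events = EPOLLIN | EPOLLONESHOT;	/* assumed mask */
		ev.data.fd = fd;
		if (epoll_ctl(a->epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
			close(fd);	/* could not register, drop client */
	}
	return NULL;
}
```

The thread would be started once with pthread_create() and a
struct acceptor pointing at the listen socket and the shared epoll
fd. Workers then call epoll_wait() on epfd as usual and, because of
EPOLLONESHOT in this sketch, re-arm each client with EPOLL_CTL_MOD
after handling it.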

The majority of epoll events in a typical server should not be
for listen sockets, so I'd rather not bloat existing code paths
for them. For web servers nowadays, maintaining long-lived
connections to avoid handshakes is even more beneficial with
increasing HTTPS and HTTP/2 adoption, so listen socket events
should become less common.