[RFC v2] epoll: avoid spinlock contention with wfcqueue

From: Eric Wong
Date: Mon Mar 18 2013 - 07:07:36 EST


Eric Wong <normalperson@xxxxxxxx> wrote:
> I'm posting this lightly tested version since I may not be able to do
> more testing/benchmarking until the weekend.

Still lightly tested (on an initramfs KVM, no real applications, yet).

> Davide's totalmess is still running, so that's probably a good sign :)
> http://www.xmailserver.org/totalmess.c

Ditto :) Also testing with eponeshotmt, which is close to my target
use case: http://yhbt.net/eponeshotmt.c

> I will look for more ways to break this (and benchmark when I stop
> finding ways to break it). No real applications tested, yet, and
> I think I can improve upon this, too.

No real apps, yet, and I need to make sure this doesn't cause
regressions for the traditional single-threaded event loop case.

This is the use case I mainly care about (multiple tasks calling
epoll_wait(maxevents=1) to divide work).
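
For illustration, a minimal userspace sketch of that pattern follows.
It is not eponeshotmt.c and makes several assumptions of its own: the
names (struct demo, worker, run_demo) are hypothetical, events are
preloaded into EFD_SEMAPHORE eventfds instead of being generated by
separate threads, and descriptors are registered with EPOLLONESHOT so
only one worker handles a given descriptor at a time.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define DEMO_MAX 64	/* arbitrary cap on threads and fds for this sketch */

struct demo {
	int epfd;		/* epoll instance shared by all workers */
	long total;		/* number of events preloaded overall */
	atomic_long consumed;	/* events successfully read so far */
};

/* Each worker takes at most one event per epoll_wait() call. */
static void *worker(void *arg)
{
	struct demo *d = arg;
	struct epoll_event ev;

	while (atomic_load(&d->consumed) < d->total) {
		/* 100ms timeout so idle workers notice completion */
		int n = epoll_wait(d->epfd, &ev, 1, 100);

		if (n < 0 && errno != EINTR)
			break;
		if (n <= 0)
			continue;

		int fd = ev.data.fd;
		uint64_t val;

		/* semaphore-mode eventfd: each read consumes one event */
		if (read(fd, &val, sizeof(val)) == (ssize_t)sizeof(val))
			atomic_fetch_add(&d->consumed, 1);

		/* EPOLLONESHOT disabled the fd on delivery; re-arm it */
		ev.events = EPOLLIN | EPOLLONESHOT;
		ev.data.fd = fd;
		epoll_ctl(d->epfd, EPOLL_CTL_MOD, fd, &ev);
	}
	return NULL;
}

/* Returns the number of events consumed by all workers. */
long run_demo(int nthreads, int nfds, long per_fd)
{
	struct demo d;
	pthread_t tids[DEMO_MAX];
	int fds[DEMO_MAX];
	int i, rc;

	assert(nthreads > 0 && nthreads <= DEMO_MAX);
	assert(nfds > 0 && nfds <= DEMO_MAX);

	d.epfd = epoll_create1(0);
	assert(d.epfd >= 0);
	d.total = (long)nfds * per_fd;
	atomic_init(&d.consumed, 0);

	for (i = 0; i < nfds; i++) {
		struct epoll_event ev = { .events = EPOLLIN | EPOLLONESHOT };

		/* initial counter value preloads per_fd events */
		fds[i] = eventfd((unsigned int)per_fd,
				 EFD_SEMAPHORE | EFD_NONBLOCK);
		assert(fds[i] >= 0);
		ev.data.fd = fds[i];
		rc = epoll_ctl(d.epfd, EPOLL_CTL_ADD, fds[i], &ev);
		assert(rc == 0);
	}

	for (i = 0; i < nthreads; i++) {
		rc = pthread_create(&tids[i], NULL, worker, &d);
		assert(rc == 0);
	}
	for (i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);

	for (i = 0; i < nfds; i++)
		close(fds[i]);
	close(d.epfd);
	return atomic_load(&d.consumed);
}
```

Calling run_demo(4, 10, 1000) has 4 workers split 10000 preloaded
events across 10 descriptors. Build with -pthread. The sketch only
shows how the work is divided; it is not meant for timing comparisons.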

Time to wait on 4 million events: 4 threads generating events,
4 threads calling epoll_wait(maxevents=1) 1 million times each,
and 10 eventfd file descriptors (fewer descriptors means a higher
chance of contention for epi->state inside ep_poll_callback).

Before:
$ eponeshotmt -t 4 -w 4 -f 10 -c 1000000
real 0m 9.58s
user 0m 1.22s
sys 0m 37.08s

After:
$ eponeshotmt -t 4 -w 4 -f 10 -c 1000000
real 0m 6.49s
user 0m 1.28s
sys 0m 24.66s

(KVM - AMD Phenom II X4 @ 3.0 GHz 4-cores)

> This depends on a couple of patches sitting in -mm and a few
> more I've posted on LKML, for convenience everything is here:

> (should apply cleanly to 3.9-rc* since there's no epoll changes in that)

http://yhbt.net/epoll-wfcqueue-v3.8.3-20130318.mbox

--------------------------------8<------------------------------