[RFC v2 3/2] epoll: avoid using extra cache line on most 64-bit

From: Eric Wong
Date: Thu Mar 21 2013 - 18:12:50 EST


By moving the event field up within struct epitem, we can avoid dirtying
(or even loading) an extra cache line on 64-bit machines with 64-byte
cache lines. Since EPOLLWAKEUP is rarely used, we also check for the
EPOLLWAKEUP flag before touching the wakeup_source, which avoids reading
a second cache line in the common case.
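
As a rough userspace illustration of the flag check (not the kernel code
itself), the sketch below tests a flag stored in the hot part of a
structure before dereferencing a pointer stored in the cold part. All
demo_* and DEMO_* names are hypothetical stand-ins, the pad array merely
simulates the distance between the two fields, and the locking/RCU
handling of the real ep_wakeup_source() is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the EPOLLWAKEUP bit; illustration only. */
#define DEMO_EPOLLWAKEUP (1u << 29)

struct demo_wakeup_source {
	int active;
};

struct demo_guard_item {
	uint32_t events;		/* hot: read on every pass */
	char pad[64];			/* simulates the fields in between */
	struct demo_wakeup_source *ws;	/* cold: only used with the flag set */
};

/*
 * Return the wakeup source only when the flag asks for it, so the common
 * case never reads the memory holding the ws pointer.
 */
static struct demo_wakeup_source *
demo_wakeup_source(const struct demo_guard_item *it)
{
	if (!(it->events & DEMO_EPOLLWAKEUP))
		return NULL;

	return it->ws;
}

/* Small usage example exercising both branches. */
void demo_guard_self_check(void)
{
	struct demo_wakeup_source ws = { 1 };
	struct demo_guard_item it = { .events = 0, .ws = &ws };

	assert(demo_wakeup_source(&it) == NULL);

	it.events |= DEMO_EPOLLWAKEUP;
	assert(demo_wakeup_source(&it) == &ws);
}
```

The early return only pays off if the flag and the pointer sit on
different cache lines, which is why the check goes together with the
field move.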

This allows ep_send_events to read and write only the first 64 bytes of
an epitem in the common case.
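
One way to keep such a layout from regressing is a compile-time check
that every hot field ends within the first cache line. The sketch below
is a userspace approximation under stated assumptions: struct
demo_layout_item is a simplified, hypothetical stand-in (it is not the
real struct epitem), DEMO_CACHE_LINE assumes 64-byte lines, and the
offsets in the comments assume a typical LP64 ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64	/* assumed cache line size in bytes */

/* Offset of the first byte past a field. */
#define DEMO_FIELD_END(type, field) \
	(offsetof(type, field) + sizeof(((type *)0)->field))

/* Simplified, hypothetical layout; offsets assume LP64. */
struct demo_layout_item {
	/* frequently accessed */
	void *link[4];		/* bytes  0..31 */
	void *file;		/* bytes 32..39 */
	int fd;			/* bytes 40..43 */
	int nwait;		/* bytes 44..47 */
	int state;		/* bytes 48..51 */
	uint32_t events;	/* bytes 52..55 */
	uint64_t data;		/* bytes 56..63 */

	/* infrequently accessed */
	void *pwqlist[2];
	void *ws;
};

/* Fail the build if the last hot field spills past the first line. */
static_assert(DEMO_FIELD_END(struct demo_layout_item, data) <= DEMO_CACHE_LINE,
	      "hot fields must fit in the first cache line");
```

The check is only meaningful if the object itself starts on a cache line
boundary, which depends on how it is allocated.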

This patch was only made possible by the smaller footprint required
by wfcqueue.

epwbench test timings:

Before (without wfcq at all):
AVG: 5.448400
SIG: 0.003056

Before (with wfcq local):
AVG: 5.532024
SIG: 0.000244

After (this commit):
AVG: 5.331539
SIG: 0.000234

Even with the variability between runs on my KVM, I'm confident this
wfcqueue epoll series introduces no performance regressions in the
common single-threaded use cases of epoll.

ref: http://www.xmailserver.org/epwbench.c

Somewhat-tested-by: Eric Wong <normalperson@xxxxxxxx>
Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
Cc: Davide Libenzi <davidel@xxxxxxxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---
fs/eventpoll.c | 27 +++++++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 1e04175..82bf483 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -155,12 +155,27 @@ struct epitem {
/* The file descriptor information this item refers to */
struct epoll_filefd ffd;

- /* Number of active wait queue attached to poll operations */
+ /*
+ * Number of active wait queue attached to poll operations
+ * This is infrequently used, it pads well here but may be
+ * removed in the future
+ */
int nwait;

/* state of this item */
enum epoll_item_state state;

+ /* The structure that describe the interested events and the source fd */
+ struct epoll_event event;
+
+ /*
+ * --> 64-byte boundary for 64-bit systems <--
+ * frequently accessed (read/written) items are above this comment
+ * infrequently accessed items are below this comment
+ * Keeping frequently accessed items within the 64-byte boundary
+ * prevents extra cache line usage on common x86-64 machines
+ */
+
/* List containing poll wait queues */
struct list_head pwqlist;

@@ -172,9 +187,6 @@ struct epitem {

/* wakeup_source used when EPOLLWAKEUP is set */
struct wakeup_source __rcu *ws;
-
- /* The structure that describe the interested events and the source fd */
- struct epoll_event event;
};

/*
@@ -596,6 +608,13 @@ static void ep_unregister_pollwait(struct eventpoll *ep, struct epitem *epi)
/* call only when ep->mtx is held */
static inline struct wakeup_source *ep_wakeup_source(struct epitem *epi)
{
+ /*
+ * avoid loading the extra cache line on machines with
+ * <= 64-byte cache lines
+ */
+ if (!(epi->event.events & EPOLLWAKEUP))
+ return NULL;
+
return rcu_dereference_check(epi->ws, lockdep_is_held(&epi->ep->mtx));
}

--
Eric Wong
