Re: [PATCHSET v3 0/5] Add support for epoll min_wait

From: Willem de Bruijn
Date: Wed Nov 02 2022 - 13:47:10 EST


On Sun, Oct 30, 2022 at 6:02 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> Hi,
>
> tldr - we saw a 6-7% CPU reduction with this patch. See patch 6 for
> full numbers.
>
> This adds support for EPOLL_CTL_MIN_WAIT, which allows setting a minimum
> time that epoll_wait() should wait for events on a given epoll context.
> Some justification and numbers are in patch 6, patches 1-5 are really
> just prep patches or cleanups.
>
> Sending this out to get some input on the API, basically. This is
> obviously a per-context type of operation in this patchset, which isn't
> necessarily ideal for any use case. Questions to be debated:
>
> 1) Would we want this to be available through epoll_wait() directly?
> That would allow this to be done on a per-epoll_wait() basis, rather
> than be tied to the specific context.
>
> 2) If the answer to #1 is yes, would we still want EPOLL_CTL_MIN_WAIT?
>
> I think there are pros and cons to both, and perhaps the answer to both is
> "yes". There are some benefits to doing this at epoll setup time, for
> example - it nicely isolates it to that part rather than needing to be
> done dynamically every time epoll_wait() is called. This also helps the
> application code, as it can turn off any busy'ness tracking based on if
> the setup accepted EPOLL_CTL_MIN_WAIT or not.
>
> Anyway, tossing this out there as it yielded quite good results in some
> initial testing, we're running more of it. Sending out a v3 now since
> someone reported that nonblock issue which is annoying. Hoping to get some
> more discussion this time around, or at least some...

My main question is whether the cycle gains justify the code
complexity and runtime cost in all other epoll paths.

Syscall overhead is quite dependent on architecture and things like KPTI.

Indeed, I was also wondering whether an extra timeout arg to
epoll_wait would give the same feature with fewer side effects. Then
there would be no need for the new ctl API.
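
To make the per-call idea concrete, here is a rough userspace
approximation, for illustration only. The name epoll_wait_min() and its
min_wait_us argument are hypothetical; they are not part of this patchset
or of any kernel or libc API. The sketch sleeps unconditionally before
polling, so it pays the full delay even when events are already pending,
which an in-kernel implementation would not need to do. It also assumes
the minimum wait should be capped by the overall timeout.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <time.h>

/*
 * Hypothetical stand-in for "epoll_wait() with an extra min-wait arg".
 * Sleeps for min_wait_us first so that events can accumulate, then does
 * one ordinary epoll_wait() with whatever is left of timeout_ms.
 * Assumption: min_wait_us is capped by timeout_ms when timeout_ms >= 0.
 */
static int epoll_wait_min(int epfd, struct epoll_event *events, int maxevents,
			  int timeout_ms, int64_t min_wait_us)
{
	struct timespec req;

	if (min_wait_us < 0) {
		errno = EINVAL;
		return -1;
	}

	/* Never sleep past the caller's overall timeout. */
	if (timeout_ms >= 0 && min_wait_us > (int64_t)timeout_ms * 1000)
		min_wait_us = (int64_t)timeout_ms * 1000;

	req.tv_sec = min_wait_us / 1000000;
	req.tv_nsec = (min_wait_us % 1000000) * 1000;

	/* Restart the sleep with the remaining time if a signal interrupts. */
	while (nanosleep(&req, &req) == -1 && errno == EINTR)
		;

	/* Charge the minimum wait against the overall timeout. */
	if (timeout_ms >= 0)
		timeout_ms -= (int)(min_wait_us / 1000);

	return epoll_wait(epfd, events, maxevents, timeout_ms);
}
```

A caller would use it as a drop-in for epoll_wait(), e.g.
epoll_wait_min(epfd, events, maxevents, 1000, 200) to batch for 200 usec
within a 1 second timeout. No context state is involved, so different
callers can pick different values.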