Re: Strange issues with epoll since 5.0

From: Eric Wong
Date: Wed Apr 24 2019 - 15:39:05 EST


Omar Kilani <omar.kilani@xxxxxxxxx> wrote:
> Hi there,
>
> I'm still trying to piece together a reproducible test that triggers
> this, but I wanted to post in case someone goes "hmmm... change X
> might have done this".

Maybe Davidlohr knows, since he's responsible for most of the
epoll changes in 5.0.

> Basically, something's broken (or at least, has changed enough to
> cause problems in user space) in epoll since 5.0. It's still broken in
> 5.1-rc5.
>
> It doesn't happen 100% of the time. It's sort of hard to pin down but
> I've observed the following:
>
> * nginx not accepting connections under load
> * A java app which uses netty / NIO having strange writability
> semantics on channels, which confuses netty / java enough to not
> properly flush written data on the socket.
>
> I went and tested these Linux kernels:
>
> 4.20.17
> 4.19.32
> 4.14.111
>
> And the issue(s) do not show up there.
>
> I'm still actively chasing this up, and will report back -- I haven't
> touched kernel code in 15 years so I'm a little rusty. :)
>
> Regards,
> Omar