Re: [PATCH RESEND v2] fs/epoll: Remove unnecessary wakeups of nested epoll that in ET mode

From: Jason Baron
Date: Tue Sep 24 2019 - 11:19:58 EST




On 9/24/19 10:06 AM, Heiher wrote:
> Hi,
>
> On Mon, Sep 23, 2019 at 11:34 PM Jason Baron <jbaron@xxxxxxxxxx> wrote:
>>
>>
>>
>> On 9/20/19 12:00 PM, Jason Baron wrote:
>>> On 9/19/19 5:24 AM, hev wrote:
>>>> From: Heiher <r@xxxxxx>
>>>>
>>>> Take the case where we have:
>>>>
>>>> t0
>>>> | (ew)
>>>> e0
>>>> | (et)
>>>> e1
>>>> | (lt)
>>>> s0
>>>>
>>>> t0: thread 0
>>>> e0: epoll fd 0
>>>> e1: epoll fd 1
>>>> s0: socket fd 0
>>>> ew: epoll_wait
>>>> et: edge-trigger
>>>> lt: level-trigger
>>>>
>>>> When s0 fires an event, e1 catches the event, and then e0 catches an event from
>>>> e1. After this, There is a thread t0 do epoll_wait() many times on e0, it should
>>>> only get one event in total, because e1 is a dded to e0 in edge-triggered mode.
>>>>
>>>> This patch only allows the wakeup(&ep->poll_wait) in ep_scan_ready_list under
>>>> two conditions:
>>>>
>>>> 1. depth == 0.


What is the point of this condition again? I was thinking we only need
to do #2.

>>>> 2. There have event is added to ep->ovflist during processing.
>>>>
>>>> Test code:
>>>> #include <unistd.h>
>>>> #include <sys/epoll.h>
>>>> #include <sys/socket.h>
>>>>
>>>> int main(int argc, char *argv[])
>>>> {
>>>> int sfd[2];
>>>> int efd[2];
>>>> struct epoll_event e;
>>>>
>>>> if (socketpair(AF_UNIX, SOCK_STREAM, 0, sfd) < 0)
>>>> goto out;
>>>>
>>>> efd[0] = epoll_create(1);
>>>> if (efd[0] < 0)
>>>> goto out;
>>>>
>>>> efd[1] = epoll_create(1);
>>>> if (efd[1] < 0)
>>>> goto out;
>>>>
>>>> e.events = EPOLLIN;
>>>> if (epoll_ctl(efd[1], EPOLL_CTL_ADD, sfd[0], &e) < 0)
>>>> goto out;
>>>>
>>>> e.events = EPOLLIN | EPOLLET;
>>>> if (epoll_ctl(efd[0], EPOLL_CTL_ADD, efd[1], &e) < 0)
>>>> goto out;
>>>>
>>>> if (write(sfd[1], "w", 1) != 1)
>>>> goto out;
>>>>
>>>> if (epoll_wait(efd[0], &e, 1, 0) != 1)
>>>> goto out;
>>>>
>>>> if (epoll_wait(efd[0], &e, 1, 0) != 0)
>>>> goto out;
>>>>
>>>> close(efd[0]);
>>>> close(efd[1]);
>>>> close(sfd[0]);
>>>> close(sfd[1]);
>>>>
>>>> return 0;
>>>>
>>>> out:
>>>> return -1;
>>>> }
>>>>
>>>> More tests:
>>>> https://github.com/heiher/epoll-wakeup
>>>>
>>>> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
>>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>>> Cc: Davide Libenzi <davidel@xxxxxxxxxxxxxxx>
>>>> Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
>>>> Cc: Dominik Brodowski <linux@xxxxxxxxxxxxxxxxxxxx>
>>>> Cc: Eric Wong <e@xxxxxxxxx>
>>>> Cc: Jason Baron <jbaron@xxxxxxxxxx>
>>>> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>>>> Cc: Roman Penyaev <rpenyaev@xxxxxxx>
>>>> Cc: Sridhar Samudrala <sridhar.samudrala@xxxxxxxxx>
>>>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>>>> Cc: linux-fsdevel@xxxxxxxxxxxxxxx
>>>> Signed-off-by: hev <r@xxxxxx>
>>>> ---
>>>> fs/eventpoll.c | 5 ++++-
>>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
>>>> index c4159bcc05d9..fa71468dbd51 100644
>>>> --- a/fs/eventpoll.c
>>>> +++ b/fs/eventpoll.c
>>>> @@ -685,6 +685,9 @@ static __poll_t ep_scan_ready_list(struct eventpoll *ep,
>>>> if (!ep_locked)
>>>> mutex_lock_nested(&ep->mtx, depth);
>>>>
>>>> + if (!depth || list_empty_careful(&ep->rdllist))
>>>> + pwake++;
>>>> +

This is the check I'm wondering why it's needed?

Thanks,


-Jason