Re: [patch] fs, epoll: short circuit fetching events if thread has been killed

From: David Rientjes
Date: Tue May 09 2017 - 20:08:28 EST


On Tue, 9 May 2017, Andrew Morton wrote:

> > We've encountered zombies that are waiting for a thread to exit that are
> > looping in ep_poll() almost endlessly although there is a pending SIGKILL
> > as a result of a group exit.
> >
> > This happens because we always find ep_events_available() and fetch more
> > events and never are able to check for signal_pending() that would break
> > from the loop and return -EINTR.
> >
> > Special case fatal signals and break immediately to guarantee that we
> > don't loop to fetch more events and delay making a timely exit.
> >
> > It would also be possible to simply move the check for signal_pending()
> > higher than checking for ep_events_available(), but there have been no
> > reports of delayed signal handling other than SIGKILL preventing zombies
> > from exiting that would be fixed by this.
>
> Any thoughts on the priority of this? -stable? If so, why?
>

It fixes an issue for us where we have witnessed zombies sticking around
for at least O(minutes), but considering the code has been like this
forever and nobody else has complained as far as I can find, I would
simply queue it up for 4.12.
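
To illustrate the ordering problem, here is a rough userspace model of the
loop, not the kernel code itself. The struct, its fields, and poll_model()
are made up for illustration; they only stand in for ep_events_available(),
signal_pending() and a pending SIGKILL. With events always available, the
ordinary signal check is never reached unless the fatal-signal check comes
first.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real epoll structures. */
struct model {
	bool events_available;	/* models ep_events_available() */
	bool signal_pending;	/* models signal_pending(current) */
	bool fatal_pending;	/* models a pending SIGKILL */
	int max_iters;		/* bound so the model always terminates */
};

/*
 * Returns -EINTR if the loop exits because of a signal, otherwise the
 * number of fetch rounds performed before the loop ended.
 */
static int poll_model(const struct model *m, bool short_circuit)
{
	int rounds = 0;

	assert(m->max_iters >= 0);

	while (rounds < m->max_iters) {
		/* Proposed ordering: look at fatal signals before anything else. */
		if (short_circuit && m->fatal_pending)
			return -EINTR;

		/* Events are available: fetch them and go around again. */
		if (m->events_available) {
			rounds++;
			continue;
		}

		/* Only reached once no events are available. */
		if (m->signal_pending)
			return -EINTR;

		break;
	}

	return rounds;
}
```

With events_available, signal_pending and fatal_pending all set and
max_iters = 1000, poll_model(&m, false) runs all 1000 rounds, while
poll_model(&m, true) returns -EINTR on the first pass.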