Re: [patch 016/104] epoll: introduce resource usage limits

From: Bron Gondwana
Date: Fri Jan 23 2009 - 22:50:51 EST

On Fri, Jan 23, 2009 at 09:06:31AM -0800, Greg KH wrote:
> On Fri, Jan 23, 2009 at 08:47:45PM +1100, Bron Gondwana wrote:
> > On Thu, 22 Jan 2009 21:16 -0800, "Greg KH" <gregkh@xxxxxxx> wrote:
> > >
> > > I would suggest just changing this default value then, it's a simple
> > > userspace configuration item, and for your boxes, it sounds like a
> > > larger value would be more suitable.

If everyone, or every distribution at least, has to change it then the
default is probably wrong. The error message in the postfix logs didn't
immediately point me at the issue, especially since I tried debugging on
one of our "production" mxes, only to discover that the epoll limit
didn't exist there. They're slightly behind in kernel versions.
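
For anyone debugging across mixed kernel versions, a minimal sketch of
checking whether the limit exists at all. The sysctl path below is my
assumption (it is not named anywhere in this thread), and
read_epoll_limit is just an illustrative helper name.

```python
from pathlib import Path

# Assumed location of the epoll instance limit; not confirmed in this thread.
EPOLL_LIMIT = Path("/proc/sys/fs/epoll/max_user_instances")


def read_epoll_limit(path=EPOLL_LIMIT):
    """Return the limit as an int, or None if this kernel has no such file."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        # Older kernels do not expose the file, so there is no limit to report.
        return None


if __name__ == "__main__":
    limit = read_epoll_limit()
    if limit is None:
        print("no epoll limit on this kernel")
    else:
        print(f"epoll limit: {limit}")
```

Returning None rather than raising keeps the "limit not present" case
distinct from a real value, which is the situation I ran into on the
older boxes.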

> > I guess Postfix is a bit of an odd case here. It runs lots of
> > processes, yet uses epoll within many of them as well - sort of
> > a historical design in some ways, but also to enforce maximum
> > privilege separation with many of the daemons able to
> > be run under chroot with limited capabilities.
> >
> > So I guess I have a few questions left:
> >
> > 1) is this value ever supposed to be hit in practice by
> > non-malicious software? If not, it appears 128 is too low.
> It does appear a bit low. What looks to you like a good value to use as
> a default?

This thread suggests that it's not just postfix having the issue, and
offers 1024 as a saner default:

There's also a Russian thread that pointed me at this patch in the first
place, and another place that suggested 1024 as well. Seems "the
cloud"[tm] is converging on 1024.

> > 2) if we're going to stick with 128, is there any way to query
> > the kernel as to how close to the limit it's getting? As an
> > example, our system polls /proc/sys/fs/file-max every
> > 2 minutes, and warns us if it's getting "full".
> Good idea, we should report this somewhere for the very reasons you
> suggest. Can you write up a patch to do this? If not, I'll see what I
> can do.

I'll have a look at it. There are two main choices I think - either one
file with just the "max", or some data view that shows all the users'
counts. It looks like it will have to enumerate the user list anyway.
(I've been poking around in kernel/user.c. Looks like a
hlist_for_each_entry on uidhash_table will do the trick. I'm guessing
we only want to display the value for the current user_ns anyway. I
don't really understand the user namespacing stuff, since I've never
used it)
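
For comparison, the userspace side of the file-max style check mentioned
above could look roughly like this. It is a sketch, not our actual
monitoring code: it assumes usage is read from /proc/sys/fs/file-nr
(three numbers: allocated, unused, maximum), and the 80% threshold and
function names are made up for illustration. Something equivalent for
epoll would need the kernel to export a per-user count first.

```python
import time


def usage_fraction(file_nr_text):
    """Fraction of the file handle limit in use, from file-nr contents."""
    allocated, unused, maximum = (int(x) for x in file_nr_text.split())
    return (allocated - unused) / maximum


def check(file_nr_text, threshold=0.8):
    """Return a warning string if usage is at or above threshold, else None."""
    frac = usage_fraction(file_nr_text)
    if frac >= threshold:
        return f"WARNING: file handles {frac:.0%} full"
    return None


if __name__ == "__main__":
    # Poll every 2 minutes, as described in the quoted text.
    while True:
        with open("/proc/sys/fs/file-nr") as f:
            message = check(f.read())
        if message:
            print(message)
        time.sleep(120)
```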

Most of all I'm interested in this because it's a good way to
actually get some viable statistics on what the default value
should be.

Bron ( still learning my way around the kernel - I've only written one
patch before, and it had a lot of babysitting from Linus! )