Re: [kernel-hardening] [PATCH] move RLIMIT_NPROC check from set_user() to do_execve_common()

From: Solar Designer
Date: Thu Jul 21 2011 - 15:40:00 EST

On Thu, Jul 21, 2011 at 11:21:07AM -0700, Linus Torvalds wrote:
> I think we could have a pretty simple approach that "works in
> practice": retain the check at setuid() time, but make it a higher
> limit.
> IOW, the logic is that we have two competing pressures:
> (a) we should try to avoid failing on setuid(), because there is a
> real risk that the setuid caller doesn't really check the failure case
> and opens itself up for a security problem
> and
> (b) never failing setuid at all is in itself a security problem,
> since it can lead to DoS attacks in the form of excessive resource use
> by one user.

I don't recall anyone stating (b) the way you did above (or sufficiently
similar). I wouldn't consider setuid() never failing to be a security
problem. BTW, some people consider setuid() failing on RLIMIT_NPROC
kernel "brokenness", which applications have to "work around":

"I'm aware of course that some interfaces *can* fail for nonstandard
reasons under Linux (dup2 and set*uid come to mind), and I've tried to
work around these and shield applications from the brokenness..."

So opinions on setuid() failing vary, whereas (a) is clear - there have
been vulnerabilities caused by setuid() failing.

The DoS that you mention doesn't necessarily have to be dealt with by
setuid() failing on RLIMIT_NPROC (nor on a higher limit). It could also
be dealt with by checking the limit on execve(), like we've been doing
on shared web hosting servers for years, and, if desired, by letting
applications like Android/Zygote check for the condition themselves via
a new prctl() (or they can simply pass an extra fork(), although that's
a bit costly).

> IOW, I'd suggest simply making the rule be that "setuid() allows 10%
> more users than the limit technically says". It's not a guarantee, but
> it means that in order to hit the problem, you need to have *both* a
> setuid application that allows unconstrained user forking *and*
> doesn't check the setuid() return value.
> Put another way: a user cannot force the "we're at the edge of the
> setuid() limit" on its own by just forking - the user will be stopped
> 10% before the setuid() failure case can ever trigger.

For a malicious user, this merely adds the task of triggering a race
condition - have a sufficient number of processes accumulate in the
state between setuid() and execve(). If the program in question can be
made to sleep, this may be trivial to do. Otherwise, it may require
very rapid (automated) requests and high system load.

(BTW, 10% of 0 would be 0, which would allow for attacks that are as
simple as they are now, but that's an implementation detail. Of course,
you'd actually add some constant as well.)

> Is this some "guarantee of nothing bad can ever happen"? No. If you
> have bad setuid applications, you will have problems. But it's a "you
> need to really work harder at it and you need to find more things to
> go wrong", which is after all what real security is all about.
> No?

I generally support having multiple layers of security even if some are
non-perfect, but in this case we have a problem that we can _fully_ deal
with rather than merely make attacks harder.

So my proposal remains to go with Vasiliy's trivial patch and maybe add
a few things on top of it as I mentioned in my previous message.

