Re: [PATCH 2/2] core: allow setrlimit to non-current tasks

From: Oleg Nesterov
Date: Wed Sep 02 2009 - 09:54:49 EST

On 09/02, Jiri Slaby wrote:
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1240,20 +1240,28 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
> struct rlimit *new_rlim)
> {
> struct rlimit *old_rlim;
> + unsigned long flags;
> int retval;
> if (new_rlim->rlim_cur > new_rlim->rlim_max)
> return -EINVAL;
> +
> + if (lock_task_sighand(tsk, &flags) == NULL)
> + return -ESRCH;

No, sorry, this can't work.

Because we need task_lock() to update rlimits, and ->alloc_lock does not
nest under ->siglock.
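
To make the ordering problem concrete, here is a toy userspace model, not kernel code. All toy_* names are hypothetical stand-ins invented for illustration, and the locks are plain flags with a lockdep-like check rather than real spinlocks. The only rule it encodes is the one stated above: ->alloc_lock must not be acquired while ->siglock is held.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins; none of these names are kernel API. */
struct toy_task {
	int siglock_held;	/* models ->sighand->siglock */
	int alloc_lock_held;	/* models ->alloc_lock, i.e. task_lock() */
	unsigned long rlim_cur;
};

static void toy_lock_siglock(struct toy_task *t)
{
	assert(!t->siglock_held);
	t->siglock_held = 1;
}

static void toy_unlock_siglock(struct toy_task *t)
{
	t->siglock_held = 0;
}

/* Refuses the nesting objected to above: ->alloc_lock under ->siglock. */
static int toy_task_lock(struct toy_task *t)
{
	if (t->siglock_held)
		return -EDEADLK;
	t->alloc_lock_held = 1;
	return 0;
}

static void toy_task_unlock(struct toy_task *t)
{
	t->alloc_lock_held = 0;
}

/* Ordering used by the patch: siglock first, then task_lock() inside. */
int toy_setrlimit_patch_order(struct toy_task *t, unsigned long cur)
{
	int ret;

	toy_lock_siglock(t);
	ret = toy_task_lock(t);
	if (ret == 0) {
		t->rlim_cur = cur;
		toy_task_unlock(t);
	}
	toy_unlock_siglock(t);
	return ret;
}

/* task_lock() taken with no siglock held: the check has nothing to flag. */
int toy_setrlimit_task_lock_only(struct toy_task *t, unsigned long cur)
{
	int ret = toy_task_lock(t);

	if (ret)
		return ret;
	t->rlim_cur = cur;
	toy_task_unlock(t);
	return 0;
}
```

In this model the patch's ordering is rejected and leaves the limit untouched, while the second variant succeeds. It says nothing about which outer lock is the right replacement; that is the question discussed below.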

Looks like we have to use tasklist_lock, but please don't use _irq, and
please do not check ->signal != NULL. Perhaps it makes sense to take
tasklist only if !same_thread_group(tsk, current) though.

Oh. We really need to make ->signal refcountable.

But there is another minor problem. If we use read_lock(&tasklist_lock), then
the write to /proc/application_pid/limits can race with the application doing
its own setrlimit().

Nothing bad can happen, but this means that "echo ... > /proc/limits" can
be lost. Not good: if the admin wants to lower ->rlim_max, we should try to
ensure this always works.
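
A toy illustration of how such a write could get lost, again in plain userspace C with hypothetical toy_* names. It assumes the unprivileged check against the old ->rlim_max and the store of the new value are two separate steps, so a shared lock lets the admin's write land in between. That interleaving is my reading of the race, shown single-threaded and deterministic on purpose.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of one rlimit slot; not the kernel's struct rlimit. */
struct toy_rlimit {
	unsigned long cur;
	unsigned long max;
};

/* Step 1: validate. Raising max above the current max needs privilege. */
static int toy_check(const struct toy_rlimit *old,
		     const struct toy_rlimit *new_rlim, int privileged)
{
	if (new_rlim->cur > new_rlim->max)
		return -EINVAL;
	if (new_rlim->max > old->max && !privileged)
		return -EPERM;
	return 0;
}

/* Step 2: store the new limits. */
static void toy_commit(struct toy_rlimit *old,
		       const struct toy_rlimit *new_rlim)
{
	*old = *new_rlim;
}

/* Admin's write lands between the application's check and its commit. */
unsigned long toy_racy_interleaving(void)
{
	struct toy_rlimit lim = { 100, 200 };
	struct toy_rlimit app = { 200, 200 };	/* application keeps max at 200 */
	struct toy_rlimit admin = { 50, 50 };	/* admin lowers max to 50 */
	int ok;

	ok = toy_check(&lim, &app, 0);	/* passes against the old max of 200 */
	toy_commit(&lim, &admin);	/* admin's lower limit is stored */
	if (ok == 0)
		toy_commit(&lim, &app);	/* stale check: admin's write is lost */
	return lim.max;
}

/* Same two updates, but each check and commit runs as one unit. */
unsigned long toy_serialized(void)
{
	struct toy_rlimit lim = { 100, 200 };
	struct toy_rlimit app = { 200, 200 };
	struct toy_rlimit admin = { 50, 50 };

	if (toy_check(&lim, &admin, 1) == 0)
		toy_commit(&lim, &admin);
	if (toy_check(&lim, &app, 0) == 0)	/* now fails with -EPERM */
		toy_commit(&lim, &app);
	return lim.max;
}
```

With the interleaved order the final max is back at 200 and the admin's change is gone. With each update done as one unit the application's request fails against the lowered max and the final value stays at 50.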


