Re: [PATCH tip/core/urgent 3/3] sched: protect __sched_setscheduler() access to cgroups

From: Paul E. McKenney
Date: Thu Apr 22 2010 - 17:25:56 EST


On Thu, Apr 22, 2010 at 10:33:18PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-04-22 at 12:54 -0700, Paul E. McKenney wrote:
> > A given task's cgroups structures must remain while that task is running
> > due to reference counting, so this is presumably a false positive.
> > Updated to reflect feedback from Tetsuo Handa.
>
> I think it's not a false positive, I think we can race with the task
> being placed in another cgroup. We don't hold task_lock() [our other
> discussion] nor does it hold rq->lock [used by the sched ->attach()
> method].

Ah, I am dropping this patch then.

Ingo, please accept my apologies for the confusion caused by submitting it too soon!

Thanx, Paul

> That said, we should probably cure the race condition of
> sched_setscheduler() vs ->attach().
>
> Something like the below perhaps?
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> ---
> kernel/sched.c | 38 ++++++++++++++++++++++++++------------
> 1 files changed, 26 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 95eaecc..345df67 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -4425,16 +4425,6 @@ recheck:
>  	}
>  
>  	if (user) {
> -#ifdef CONFIG_RT_GROUP_SCHED
> -		/*
> -		 * Do not allow realtime tasks into groups that have no runtime
> -		 * assigned.
> -		 */
> -		if (rt_bandwidth_enabled() && rt_policy(policy) &&
> -				task_group(p)->rt_bandwidth.rt_runtime == 0)
> -			return -EPERM;
> -#endif
> -
>  		retval = security_task_setscheduler(p, policy, param);
>  		if (retval)
>  			return retval;
> @@ -4450,6 +4440,28 @@ recheck:
>  	 * runqueue lock must be held.
>  	 */
>  	rq = __task_rq_lock(p);
> +	retval = 0;
> +#ifdef CONFIG_RT_GROUP_SCHED
> +	if (user) {
> +		/*
> +		 * Do not allow realtime tasks into groups that have no runtime
> +		 * assigned.
> +		 *
> +		 * RCU read lock not strictly required but here for PROVE_RCU,
> +		 * the task is pinned by holding rq->lock which avoids races
> +		 * with ->attach().
> +		 */
> +		rcu_read_lock();
> +		if (rt_bandwidth_enabled() && rt_policy(policy) &&
> +				task_group(p)->rt_bandwidth.rt_runtime == 0)
> +			retval = -EPERM;
> +		rcu_read_unlock();
> +
> +		if (retval)
> +			goto unlock;
> +	}
> +#endif
> +
>  	/* recheck policy now with rq lock held */
>  	if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
>  		policy = oldpolicy = -1;
> @@ -4477,12 +4489,14 @@ recheck:
>  
>  		check_class_changed(rq, p, prev_class, oldprio, running);
>  	}
> +unlock:
>  	__task_rq_unlock(rq);
>  	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
>  
> -	rt_mutex_adjust_pi(p);
> +	if (!retval)
> +		rt_mutex_adjust_pi(p);
>  
> -	return 0;
> +	return retval;
>  }
>  
>  /**
>
>
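The shape of the fix quoted above (take the lock first, do the group check
under it, and leave through a single unlock label that returns retval) can be
sketched in user space. This is only an analogy under stated assumptions: a
pthread mutex stands in for rq->lock, and struct task, struct group, attach()
and set_rt_policy() are hypothetical names invented for the illustration, not
kernel code.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for illustration; not kernel structures. */
struct group {
	long rt_runtime;	/* 0 means no realtime runtime assigned */
};

struct task {
	pthread_mutex_t lock;	/* plays the role of rq->lock */
	struct group *grp;	/* may be changed by attach() */
	int policy;
};

/* Move a task to another group; serialized by t->lock. */
static void attach(struct task *t, struct group *g)
{
	pthread_mutex_lock(&t->lock);
	t->grp = g;
	pthread_mutex_unlock(&t->lock);
}

/* Set a realtime policy, refusing groups that have no runtime. */
static int set_rt_policy(struct task *t, int policy)
{
	int retval = 0;

	pthread_mutex_lock(&t->lock);
	/*
	 * attach() cannot change t->grp while we hold t->lock, so the
	 * group that is checked is the group the policy is applied in.
	 */
	if (t->grp->rt_runtime == 0) {
		retval = -EPERM;
		goto unlock;
	}
	t->policy = policy;
unlock:
	pthread_mutex_unlock(&t->lock);
	return retval;
}
```

With a task whose group has rt_runtime == 0, set_rt_policy() returns -EPERM
and leaves the policy untouched; after attach() moves the task to a group
with nonzero runtime, the same call returns 0. Doing the check before taking
the lock would leave a window in which attach() could swap the group between
the check and the update, which is the race described in the quoted text.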