Re: [PATCH tip/core/urgent 3/3] sched: protect __sched_setscheduler() access to cgroups

From: Peter Zijlstra
Date: Thu Apr 22 2010 - 16:33:37 EST


On Thu, 2010-04-22 at 12:54 -0700, Paul E. McKenney wrote:
> A given task's cgroups structures must remain while that task is running
> due to reference counting, so this is presumably a false positive.
> Updated to reflect feedback from Tetsuo Handa.

I think it's not a false positive; I think we can race with the task
being placed in another cgroup. We don't hold task_lock() [our other
discussion], nor do we hold rq->lock [used by the sched ->attach()
method].
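
To make the race concrete, here is a minimal userspace sketch. It is not
kernel code: struct model_group, struct model_task and the model_*()
helpers are hypothetical stand-ins, with a pthread mutex playing the role
of rq->lock. It only assumes what is stated above, namely that the
->attach() side changes the task's group under that lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are kernel API. */
struct model_group {
	long rt_runtime;		/* models rt_bandwidth.rt_runtime */
};

struct model_task {
	pthread_mutex_t rq_lock;	/* models rq->lock */
	struct model_group *grp;	/* models what task_group(p) returns */
	int is_rt;			/* models a realtime policy being set */
};

/* Models the sched ->attach() method: the move happens under rq_lock. */
static void model_attach(struct model_task *t, struct model_group *to)
{
	pthread_mutex_lock(&t->rq_lock);
	t->grp = to;
	pthread_mutex_unlock(&t->rq_lock);
}

/*
 * Racy variant: samples the group with no lock held, so the verdict can
 * be stale by the time the caller acts on it.
 */
static int model_rt_allowed_unlocked(struct model_task *t)
{
	assert(t->grp != NULL);
	return t->grp->rt_runtime != 0;
}

/*
 * Serialized variant: the group check and the policy change sit in one
 * critical section, so model_attach() cannot slip in between them.
 */
static int model_set_rt_locked(struct model_task *t)
{
	int ret = 0;

	pthread_mutex_lock(&t->rq_lock);
	assert(t->grp != NULL);
	if (t->grp->rt_runtime == 0)
		ret = -1;		/* group has no runtime: refuse */
	else
		t->is_rt = 1;
	pthread_mutex_unlock(&t->rq_lock);
	return ret;
}
```

If model_rt_allowed_unlocked() runs while the task is in a group with
runtime and model_attach() then moves it to a group without any, the
earlier "allowed" answer no longer describes the task's group.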

That said, we should probably cure the race condition of
sched_setscheduler() vs ->attach().

Something like the below perhaps?

Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
---
 kernel/sched.c |   38 ++++++++++++++++++++++++++------------
 1 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 95eaecc..345df67 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4425,16 +4425,6 @@ recheck:
 	}

 	if (user) {
-#ifdef CONFIG_RT_GROUP_SCHED
-		/*
-		 * Do not allow realtime tasks into groups that have no runtime
-		 * assigned.
-		 */
-		if (rt_bandwidth_enabled() && rt_policy(policy) &&
-				task_group(p)->rt_bandwidth.rt_runtime == 0)
-			return -EPERM;
-#endif
-
 		retval = security_task_setscheduler(p, policy, param);
 		if (retval)
 			return retval;
@@ -4450,6 +4440,28 @@ recheck:
 	 * runqueue lock must be held.
 	 */
 	rq = __task_rq_lock(p);
+	retval = 0;
+#ifdef CONFIG_RT_GROUP_SCHED
+	if (user) {
+		/*
+		 * Do not allow realtime tasks into groups that have no runtime
+		 * assigned.
+		 *
+		 * RCU read lock not strictly required but here for PROVE_RCU,
+		 * the task is pinned by holding rq->lock which avoids races
+		 * with ->attach().
+		 */
+		rcu_read_lock();
+		if (rt_bandwidth_enabled() && rt_policy(policy) &&
+				task_group(p)->rt_bandwidth.rt_runtime == 0)
+			retval = -EPERM;
+		rcu_read_unlock();
+
+		if (retval)
+			goto unlock;
+	}
+#endif
+
 	/* recheck policy now with rq lock held */
 	if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
 		policy = oldpolicy = -1;
@@ -4477,12 +4489,14 @@ recheck:

 		check_class_changed(rq, p, prev_class, oldprio, running);
 	}
+unlock:
 	__task_rq_unlock(rq);
 	raw_spin_unlock_irqrestore(&p->pi_lock, flags);

-	rt_mutex_adjust_pi(p);
+	if (!retval)
+		rt_mutex_adjust_pi(p);

-	return 0;
+	return retval;
 }

 /**
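
The error path the patch introduces (check under the lock, jump to a
single unlock label, skip the post-unlock step on failure) can be shown
in isolation. The following is a userspace sketch under stated
assumptions, not the kernel function: struct sketch_task and
sketch_setscheduler() are hypothetical names, a pthread mutex models
rq->lock, and the pi_adjusted counter stands in for the
rt_mutex_adjust_pi() call.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the task state; not a kernel structure. */
struct sketch_task {
	pthread_mutex_t lock;		/* models rq->lock */
	long group_rt_runtime;		/* models the group's rt_runtime */
	int policy_is_rt;		/* models the scheduling policy */
	int pi_adjusted;		/* counts modelled rt_mutex_adjust_pi() calls */
};

static int sketch_setscheduler(struct sketch_task *t, int want_rt, int user)
{
	int retval = 0;

	assert(t != NULL);
	pthread_mutex_lock(&t->lock);

	/* The group check runs with the lock held, as in the patch. */
	if (user && want_rt && t->group_rt_runtime == 0) {
		retval = -EPERM;
		goto unlock;
	}

	t->policy_is_rt = want_rt;
unlock:
	/* Single exit: every path drops the lock exactly once. */
	pthread_mutex_unlock(&t->lock);

	/* Post-unlock work only happens when the change was applied. */
	if (!retval)
		t->pi_adjusted++;

	return retval;
}
```

Returning retval from one place means the failure case cannot leave the
lock held, which a bare "return -EPERM" inside the locked region would.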

