[RT PATCH v2 2/2] RT: remove "paranoid" limit in push_rt_task

From: Gregory Haskins
Date: Fri Oct 03 2008 - 13:22:40 EST


A panic was discovered by Chirag Jog and traced by Gilles Carry to
the fact that a task being pushed away may itself get migrated away
during a double_lock_balance, which drops rq->lock. As a result, the
pushable_tasks list may become corrupted.

The root cause is that the "paranoid" retry limit could cause us to
bail out of a retry, but still try to remove the item from the (now
potentially incorrect) list. There are numerous ways to correct the
condition, but the paranoid feature is no longer relevant with the new
pushable logic (since pushable naturally limits the loop anyway), so
let's just remove it.

Reported-by: Chirag Jog <chirag@xxxxxxxxxxxxxxxxxx>
Found-by: Gilles Carry <gilles.carry@xxxxxxxx>
Signed-off-by: Gregory Haskins <ghaskins@xxxxxxxxxx>
---
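For reviewers, the three-way decision the patch makes after a failed
find_lock_lowest_rq() can be modelled in user space roughly as below.
This is only a sketch under stated assumptions: struct model_task,
struct model_rq, enum push_action, decide_after_failed_push() and
push_model_selftest() are hypothetical names made up for illustration,
not kernel code, and the model ignores locking and reference counting
entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real struct task_struct and struct rq differ. */
struct model_task { int cpu; };	/* cpu whose run-queue holds the task */
struct model_rq   { int cpu; };	/* cpu this run-queue belongs to */

enum push_action {
	PUSH_DEQUEUE_AND_STOP,	/* task unchanged: drop it from the pushable list */
	PUSH_STOP,		/* nothing left to push, leave the list alone */
	PUSH_RETRY,		/* something shifted: retry with the new candidate */
};

/*
 * next_task: the task we tried to push before rq->lock was dropped.
 * task:      what the pushable list reports now (NULL if it is empty).
 */
enum push_action decide_after_failed_push(const struct model_rq *rq,
					  const struct model_task *next_task,
					  const struct model_task *task)
{
	/* Only touch the list if the task is provably still ours. */
	if (next_task->cpu == rq->cpu && task == next_task)
		return PUSH_DEQUEUE_AND_STOP;

	if (!task)
		return PUSH_STOP;

	return PUSH_RETRY;
}

/* Exercises the three outcomes; returns 0 if every check passes. */
int push_model_selftest(void)
{
	struct model_rq rq = { .cpu = 0 };
	struct model_task a = { .cpu = 0 }, b = { .cpu = 0 };
	struct model_task gone = { .cpu = 1 };	/* migrated while unlocked */

	assert(decide_after_failed_push(&rq, &a, &a) == PUSH_DEQUEUE_AND_STOP);
	assert(decide_after_failed_push(&rq, &gone, NULL) == PUSH_STOP);
	assert(decide_after_failed_push(&rq, &gone, &b) == PUSH_RETRY);
	assert(decide_after_failed_push(&rq, &a, &b) == PUSH_RETRY);
	return 0;
}
```

The point of the model is that a task which has migrated is never
dequeued from this run-queue's list. The old code could reach the
dequeue with such a task once the "paranoid" counter ran out.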

kernel/sched_rt.c | 34 ++++++++++++++++++++++------------
1 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 59ead84..201bd97 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -1056,7 +1056,6 @@ static int push_rt_task(struct rq *rq)
{
struct task_struct *next_task;
struct rq *lowest_rq;
- int paranoid = RT_MAX_TRIES;

if (!rq->rt.overloaded)
return 0;
@@ -1090,23 +1089,34 @@ static int push_rt_task(struct rq *rq)
struct task_struct *task;
/*
* find lock_lowest_rq releases rq->lock
- * so it is possible that next_task has changed.
- * If it has, then try again.
+ * so it is possible that next_task has migrated.
+ *
+ * We need to make sure that the task is still on the same
+ * run-queue and is also still the next task eligible for
+ * pushing.
*/
task = pick_next_pushable_task(rq);
- if (unlikely(task != next_task) && task && paranoid--) {
- put_task_struct(next_task);
- next_task = task;
- goto retry;
+ if (task_cpu(next_task) == rq->cpu && task == next_task) {
+ /*
+ * If we get here, the task hasn't moved at all, but
+ * it has failed to push. We will not try again,
+ * since the other cpus will pull from us when they
+ * are ready.
+ */
+ dequeue_pushable_task(rq, next_task);
+ goto out;
}
+
+ if (!task)
+ /* No more tasks, just exit */
+ goto out;

/*
- * Once we have failed to push this task, we will not
- * try again, since the other cpus will pull from us
- * when they are ready
+ * Something has shifted, try again.
*/
- dequeue_pushable_task(rq, next_task);
- goto out;
+ put_task_struct(next_task);
+ next_task = task;
+ goto retry;
}

deactivate_task(rq, next_task, 0);
