[patch 05/26] sched_rt.c: resch needed in rt_rq_enqueue() for theroot rt_rq

From: Greg KH
Date: Sat Oct 18 2008 - 15:11:53 EST

2.6.26-stable review patch. If anyone has any objections, please let us

From: Dario Faggioli <raistlin@xxxxxxxx>

commit f6121f4f8708195e88cbdf8dd8d171b226b3f858 upstream

While working on the new version of the code for SCHED_SPORADIC I
noticed something strange in the present throttling mechanism. More
specifically in the throttling timer handler in sched_rt.c
(do_sched_rt_period_timer()) and in rt_rq_enqueue().

The problem is that, when unthrottling a runqueue, rt_rq_enqueue() only
asks for rescheduling if the runqueue has a sched_entity associated to
it (i.e., rt_rq->rt_se != NULL).
Now, if the runqueue is the root rq (which has a rt_se = NULL)
rescheduling does not take place, and it is delayed to some undefined
instant in the future.

This imply some random bandwidth usage by the RT tasks under throttling.
For instance, setting rt_runtime_us/rt_period_us = 950ms/1000ms an RT
task will get less than 95%. In our tests we got something varying
between 70% to 95%.
Using smaller time values, e.g., 95ms/100ms, things are even worse, and
I can see values also going down to 20-25%!!

The tests we performed are simply running 'yes' as a SCHED_FIFO task,
and checking the CPU usage with top, but we can investigate thoroughly
if you think it is needed.

Things go much better, for us, with the attached patch... Don't know if
it is the best approach, but it solved the issue for us.

Signed-off-by: Dario Faggioli <raistlin@xxxxxxxx>
Signed-off-by: Michael Trimarchi <trimarchimichael@xxxxxxxx>
Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>

kernel/sched_rt.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -96,12 +96,12 @@ static void dequeue_rt_entity(struct sch

static void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
+ struct task_struct *curr = rq_of_rt_rq(rt_rq)->curr;
struct sched_rt_entity *rt_se = rt_rq->rt_se;

- if (rt_se && !on_rt_rq(rt_se) && rt_rq->rt_nr_running) {
- struct task_struct *curr = rq_of_rt_rq(rt_rq)->curr;
- enqueue_rt_entity(rt_se);
+ if (rt_rq->rt_nr_running) {
+ if (rt_se && !on_rt_rq(rt_se))
+ enqueue_rt_entity(rt_se);
if (rt_rq->highest_prio < curr->prio)

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/