Re: [PATCH] sched: update cpupri for runqueue when its priority changes

From: Hillf Danton
Date: Sat Jun 18 2011 - 10:54:51 EST

On Fri, Jun 17, 2011 at 10:56 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> On Fri, 2011-06-17 at 20:59 +0800, Hillf Danton wrote:
>> On Fri, Jun 17, 2011 at 9:50 AM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>> > On Sun, 2011-06-05 at 17:54 +0800, Hillf Danton wrote:
>> >> When the priority of runqueue changes, lower or higher, the info of cpupri
>> >> should be updated, in cases such as pick_next_task_rt() and switched_to_rt().
>> >
>> > Why?
>> >
>> > We do the calculation on queuing and dequeuing the task, we only care
>> > about the highest priority task that is on the queue, not what is
>> > actually running.
>> >
>> It is to capture the changes in CPU priority caused by re-queued task and
>> throttled RQ.

Hi Steven,

Thanks for reviewing the patch.

> OK, I talked a little with Peter about this. We don't throttle an rq, we
> throttle a group. A group consists of tasks, not rqs. When a group is
> throttled, we do not migrate tasks, so the cpupri is not an issue here.
> For non throttled groups, tasks are enqueued and when they are, the
> cpupri is updated. We *only* care about tasks that are enqueued.
> Thus, lets look again at your patch:
>> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
>> index 08e9374..9508168 100644
>> --- a/kernel/sched_rt.c
>> +++ b/kernel/sched_rt.c
>> @@ -1158,6 +1158,8 @@ static struct task_struct
>> *pick_next_task_rt(struct rq *rq)
>>          * lock again later if there is no need to push
>>          */
>>         rq->post_schedule = has_pushable_tasks(rq);
>> +
>> +       cpupri_set(&rq->rd->cpupri, rq->cpu, p == NULL ? MAX_RT_PRIO : p->prio);
> In pick_next_task_rt(), p is the highest prio that is queued. Thus,
> cpupri is already set to p->prio. If p is NULL, then there is no rt
> tasks queued on this rq, and cpupri is set to MAX_RT_PRIO. Your patch
> here does not change anything.

There are two cases in which _pick_next_task_rt() returns NULL. The patch
captures the second one, after the rt_rq->rt_nr_running check: if NULL is
returned there, the CPU priority does change.

In another scenario that has little to do with {en,de}queue, as shown by
requeue_task_rt(), the CPU priority will change if other RT tasks exist.
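
To make the second NULL case concrete, here is a minimal user-space sketch.
It is not kernel code: struct toy_rt_rq, the toy_* helpers and
TOY_RT_RQ_INIT are hypothetical names made up for illustration, and I am
assuming that throttling is what causes the second NULL return, as mentioned
earlier in this thread. It only models the bookkeeping being argued about:
cpupri is set at enqueue time, while the pick can still come back empty.

```c
#include <stddef.h>

#define MAX_RT_PRIO 100 /* same value as the kernel constant */

/* Toy stand-in for an RT runqueue; NOT the kernel's struct rt_rq. */
struct toy_rt_rq {
	int rt_nr_running; /* number of queued RT tasks */
	int rt_throttled;  /* assumed cause of the second NULL return */
	int highest_prio;  /* best queued prio (lower value = higher prio) */
	int cpupri;        /* value last recorded for this CPU */
};

#define TOY_RT_RQ_INIT { .rt_nr_running = 0, .rt_throttled = 0, \
			 .highest_prio = MAX_RT_PRIO, .cpupri = MAX_RT_PRIO }

/* Models enqueue: the recorded priority follows the highest queued prio. */
static void toy_enqueue(struct toy_rt_rq *rt_rq, int prio)
{
	rt_rq->rt_nr_running++;
	if (prio < rt_rq->highest_prio)
		rt_rq->highest_prio = prio;
	rt_rq->cpupri = rt_rq->highest_prio;
}

/* Returns the picked prio, or -1 where the real function returns NULL. */
static int toy_pick_next(const struct toy_rt_rq *rt_rq)
{
	if (!rt_rq->rt_nr_running)
		return -1; /* first case: nothing queued */
	if (rt_rq->rt_throttled)
		return -1; /* second case: tasks queued, none picked */
	return rt_rq->highest_prio;
}

/* Nonzero when an RT prio is recorded although the pick came back empty. */
static int toy_cpupri_differs(const struct toy_rt_rq *rt_rq)
{
	return toy_pick_next(rt_rq) < 0 && rt_rq->cpupri < MAX_RT_PRIO;
}
```

With one task enqueued at prio 10 and rt_throttled then set, toy_pick_next()
returns -1 while the recorded value is still 10. Whether that difference
matters to cpupri is exactly the point under discussion.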

>>  #endif
>>         return p;
>> @@ -1673,6 +1675,8 @@ static void switched_to_rt(struct rq *rq, struct
>> task_struct *p)
>>  {
>>         int check_resched = 1;
>> +       if (!p->on_rq)
>> +               return;
>>         /*
>>          * If we are already running, then there's nothing
>>          * that needs to be done. But if we are not running
>> @@ -1680,7 +1684,7 @@ static void switched_to_rt(struct rq *rq, struct
>> task_struct *p)
>>          * If that current running task is also an RT task
>>          * then see if we can move to another run queue.
>>          */
>> -       if (p->on_rq && rq->curr != p) {
>> +       if (rq->curr != p) {
>>  #ifdef CONFIG_SMP
>>                 if (rq->rt.overloaded && push_rt_task(rq) &&
>>                     /* Don't resched if we changed runqueues */
>> @@ -1690,6 +1694,11 @@ static void switched_to_rt(struct rq *rq,
>> struct task_struct *p)
>>                 if (check_resched && p->prio < rq->curr->prio)
>>                         resched_task(rq->curr);
>>         }
>> +       else {
>> +#ifdef CONFIG_SMP
>> +               cpupri_set(&rq->rd->cpupri, rq->cpu, p->prio);
>> +#endif
> switched_to_rt() is called from sched.c's check_class_changed(), which
> is always called after enqueuing the task if p->on_rq was set. Thus, if
> this is running and is the highest priority task, cpupri would have this
> bit set too. Again, your patch does nothing but add more overhead.

Agreed, this hunk of the patch is just overhead.
