On Fri, Apr 14, 2006 at 11:50:10AM +1000, Peter Williams wrote:Siddha, Suresh B wrote:On Thu, Apr 13, 2006 at 04:57:15PM +1000, Peter Williams wrote:That's the option 4 case in my original mail. Are you suggesting that it would have been the better option to adopt? If so, why?Problem:with previous load balancer code it was a good point.. because all tasks
The move_tasks() function is designed to move UP TO the amount of load it is asked to move and in doing this it skips over tasks looking for ones whose load weights are less than or equal to the remaining load to be moved. This is (in general) a good thing but it has the unfortunate result of breaking one of the original load balancer's good points:
were weighted the same from load balancer perspective.. but now the
imbalance represents what task to move (atleast in the working
No. I was not suggesting option-4. With this change in move_tasks, we will be overriding the decision what ever we made while calculating imbalance.
Lets see for example, we have a simple DP system. With proc-0 running
one high priority and one low priority task, Proc-1 running one
low priority task. Ideally we would like to move low priority task from
P0 to P1. But with this patch, we may end up moving high priority task
from P0 to P1. But slowly after sometime(depending on high priority task
is on active/expired list), we will converge to the expected
you mean the highest priority task on the current active list of the new run queue, right?Good point. this_min_prio should probably be initialized to the minimum of this_rq->curr->prio and this_rq->best_expired_prio rather just using this_rq->curr->prio.
There will be some unnecessary movements of high priority tasks because ofHow so. At most one task per move_tasks() will be moved as a result of this code that wouldn't have been moved otherwise. That task will be a high priority task stuck behind a higher priority task on its current queue that will be the highest priority on its new queue causing a preempt and access to the CPU. I wouldn't call this unnecessary.
highest priority task can be in the expired list with normal priority
task running.. as in my above example.
Peter, Are you sure that this is a converging solution? If we want toYes, I think we're getting there.
I think we need changes to try_to_wake_up() to help high priority tasks find idle CPUs or CPUs where they would preempt when they wake up. Leaving this to the load balancer is bad for these tasks latencies. I think that this change needs to be done independently of smpnice as it would be useful even without smpnice. I'm hoping Ingo or Nick will comment on this proposal.
It would also help if you fixed the active load balance code so that it's not necessary to distort normal load balancing to accommodate it. I haven't had time to look at it myself (other than a quick glance) yet.
The only special check in find_busiest_group() helping MT/MC balancing
is pwr_now and pwr_move calculations..
These calculations will also help,
in future when we are dealing with sched groups which are not symmetric.
Asymmetries can be caused in scenarios like cpufreq, cpu logical hotplug..
I think we are unnecessarily behind active load balance...