On Wed, Apr 12, 2006 at 03:06:52PM +1000, Peter Williams wrote:
Siddha, Suresh B wrote:
Is there an example for this?

Yes, we just take a slight variation of your scenario that prompted the first patch (to which this patch is a minor modification) by adding one normal priority task to each of the CPUs. This gives us a two CPU system with CPU-0 having two high priority tasks plus one normal priority task and CPU-1 having two normal priority tasks. Clearly, the desirable load balancing outcome would be for the two high priority tasks to be on different CPUs; otherwise we have a high priority task stuck on a run queue while a normal priority task is running on another (less heavily loaded) CPU.
In order to analyze what happens during load balancing, let's use W as the load weight for a normal task and suppose that the load weight of each of the two high priority tasks is (W + k) and that "this" == CPU-1 in find_busiest_queue(). This will result in "busiest" == CPU-0 and:
this_load = 2W
this_load_per_task = W
max_load = 3W + 2k
busiest_load_per_task = W + 2k / 3
avg_load = 5W / 2 + k
max_pull = W / 2 + k
*imbalance = W / 2 + k
Whenever k < (3W / 2) this will result in *imbalance < busiest_load_per_task and we end up in the small imbalance code.
(max_load - this_load) = W + 2k which is greater than busiest_load_per_task so we decide that we want to move some load from "busiest" to "this".
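
The figures above can be rechecked with a small sketch. This is only a simplified two CPU model, not the kernel code: balance_figures() is a hypothetical helper, both CPUs are assumed to have equal power, max_pull is taken as simply (max_load - avg_load), and W = 128 and k = 32 are arbitrary sample values.

```python
from fractions import Fraction

def balance_figures(this_tasks, busiest_tasks):
    """Recompute the quantities quoted in the mail for two run queues.

    Each argument is a list of per-task load weights. Assumes a two CPU
    system with equal CPU power (a simplification, not kernel code).
    """
    this_load = sum(this_tasks)
    max_load = sum(busiest_tasks)
    avg_load = Fraction(this_load + max_load) / 2
    max_pull = max_load - avg_load  # simplified: no further clamping
    return {
        "this_load": this_load,
        "this_load_per_task": Fraction(this_load, len(this_tasks)),
        "max_load": max_load,
        "busiest_load_per_task": Fraction(max_load, len(busiest_tasks)),
        "avg_load": avg_load,
        "max_pull": max_pull,
        "imbalance": min(max_pull, avg_load - this_load),
    }

W, k = 128, 32  # sample values only; any k < 3W/2 behaves the same way
fig = balance_figures([W, W], [W + k, W + k, W])

# True when we fall into the small imbalance code
small_imbalance = fig["imbalance"] < fig["busiest_load_per_task"]
# True when the balancer still decides to move some load
wants_move = fig["max_load"] - fig["this_load"] > fig["busiest_load_per_task"]

for name, value in fig.items():
    print(f"{name} = {float(value):.2f}")
print("small imbalance:", small_imbalance, "wants move:", wants_move)
```

Using Fraction keeps terms such as 2k / 3 exact, so the comparison against the k < 3W / 2 boundary is not blurred by rounding.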
Without this patch we would set *imbalance to busiest_load_per_task, and the only task on "busiest" that has a weighted load less than or equal to this value is the normal task, so this is the one that will be moved, resulting in:
this_load = 3W
this_load_per_task = W
max_load = 2W + 2k
busiest_load_per_task = W + k
Even if you reverse the roles of "busiest" and "this", this will be considered balanced and the system will stabilize in this undesirable state. NB, as predicted, the average load per task on "this" hasn't changed and the average load per task on "busiest" has increased. We still have the situation where a high priority task is stuck on a run queue while a low priority task is running on another CPU -- we've failed :-(.
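
The unpatched move described above can be sketched as follows. This is a toy model under stated assumptions, not the kernel's move_tasks(): unpatched_small_imbalance_move() is a hypothetical helper, it moves the heaviest task whose weight fits within busiest_load_per_task, and W = 128 and k = 32 are arbitrary sample values.

```python
from fractions import Fraction

def unpatched_small_imbalance_move(this_tasks, busiest_tasks):
    """Model the unpatched small imbalance case for two run queues.

    *imbalance is set to busiest_load_per_task and one task whose weight
    does not exceed it is moved. Picking the heaviest such task is an
    assumption made for this sketch.
    """
    imbalance = Fraction(sum(busiest_tasks), len(busiest_tasks))
    movable = [w for w in busiest_tasks if w <= imbalance]
    if not movable:
        return list(this_tasks), list(busiest_tasks)
    moved = max(movable)
    remaining = list(busiest_tasks)
    remaining.remove(moved)  # drop one task of that weight from "busiest"
    return list(this_tasks) + [moved], remaining

W, k = 128, 32  # sample values only
new_this, new_busiest = unpatched_small_imbalance_move([W, W], [W + k, W + k, W])

# One task runs per CPU, so every high priority task beyond the first on a
# run queue is waiting.
waiting_high = max(0, sum(1 for w in new_busiest if w > W) - 1)

print("this:", new_this, "load", sum(new_this))
print("busiest:", new_busiest, "load", sum(new_busiest))
print("high priority tasks left waiting on busiest:", waiting_high)
```

In this model only the normal task qualifies for the move, so both high priority tasks stay on the same run queue.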
For such a 'k' value, we fail anyhow. For example, how does the normal
load balance detect an imbalance in the situation below?
this_load = 3W
this_load_per_task = W
max_load = 2W + 2k
busiest_load_per_task = W + k
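
To make the question concrete, here is a small sketch of the raw load gap between the two run queues in this state. load_gap() is a hypothetical helper and W = 128 is an arbitrary sample value; it only compares summed weights and does not model the rest of the balancer.

```python
def load_gap(W, k):
    """Load of the CPU holding both high priority tasks minus the other CPU."""
    cpu_high = 2 * (W + k)  # two high priority tasks: 2W + 2k
    cpu_normal = 3 * W      # three normal priority tasks: 3W
    return cpu_high - cpu_normal  # simplifies to 2k - W

W = 128  # sample value only
gaps = {k: load_gap(W, k) for k in (16, 64, 96)}
for k, gap in gaps.items():
    print(f"k = {k}: gap = {gap}")
```

Under these assumptions, for k < W / 2 the CPU holding both high priority tasks is actually the less loaded one, and at k = W / 2 the two loads are identical, so the summed weights alone give no hint that a high priority task is waiting.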
If we really want to distribute 'N' higher priority tasks (however small or
big the priority difference between low and high priority tasks is) onto 'N' different cpus, we will need a really different approach for load balancing.