Re: [PATCH 02/10] sched: fix find_idlest_group mess logical

From: Alex Shi
Date: Thu Dec 06 2012 - 20:34:58 EST


On 12/07/2012 08:56 AM, Frederic Weisbecker wrote:
> 2012/12/3 Alex Shi <alex.shi@xxxxxxxxx>:
>> There are 4 situations in the function:
>> 1, no task allowed group;
>> so min_load = ULONG_MAX, this_load = 0, idlest = NULL
>> 2, only local group task allowed;
>> so min_load = ULONG_MAX, this_load assigned, idlest = NULL
>> 3, only non-local task group allowed;
>> so min_load assigned, this_load = 0, idlest != NULL
>> 4, local group + another group are task allowed.
>> so min_load assigned, this_load assigned, idlest != NULL
>>
>> The current logic returns NULL in the first 3 scenarios.
>> It also returns NULL in the 4th situation if the idlest group is
>> heavier than the local group.
>>
>> Actually, I think the groups in situations 2 and 3 are also eligible
>> to host the task. In the 4th situation, I agree with biasing toward
>> the local group. Hence this patch.
>
> The way I understand the loop that uses this in select_task_rq_fair() is:
>
> a) start from the highest domain level we are allowed to run to
> migrate the task in
> b) from that top level domain, find the idlest group. If the idlest
> group contains current CPU, zoom in the child domain and repeat b). If
> the idlest group doesn't contain the current CPU, pick the idlest CPU
> from that group.
> c) In the end if we found no idler target than current CPU, then take it.
>
> So if you also return a group that contains current CPU from
> find_idlest_group(), you don't recursively zoom in the child domain
> anymore. find_idlest_cpu() will fix that for you but it may come with
> some cost because now it iterates through every CPU, or maybe half
> of them.

Not exactly. The old logic never selects a CPU from the groups in
situations 2 and 3. That is wrong, and it may leave the task on prev_cpu
even when other idler, allowed CPUs exist.

The groups in situations 2 and 3 are also eligible for the task, and
they may contain an idler, eligible CPU.
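
To make the four situations concrete, here is a toy userspace model. It is
only a sketch, not the kernel code and not the patch itself: struct
toy_group, old_pick() and new_pick() are hypothetical names, the group
load is a plain number, and the imbalance percentage is passed in by the
caller. new_pick() is one reading of the intent described above (return
the eligible group in situations 2 and 3, bias toward the local group in
situation 4).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Toy stand-in for a sched_group; every field here is hypothetical. */
struct toy_group {
    int allowed;        /* group intersects the task's allowed CPUs */
    int local;          /* group contains this_cpu */
    unsigned long load; /* average load of the group */
};

/* Old behaviour as described in this thread. */
static const struct toy_group *
old_pick(const struct toy_group *g, int n, int imbalance)
{
    const struct toy_group *idlest = NULL;
    unsigned long min_load = ULONG_MAX, this_load = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (!g[i].allowed)
            continue;               /* task may not run in this group */
        if (g[i].local) {
            this_load = g[i].load;
        } else if (g[i].load < min_load) {
            min_load = g[i].load;
            idlest = &g[i];
        }
    }
    /*
     * Situations 1 and 2: idlest is NULL.
     * Situation 3: this_load is still 0, so the bias test succeeds
     * whenever min_load > 0 and NULL is returned as well.
     */
    if (!idlest || 100 * this_load < (unsigned long)imbalance * min_load)
        return NULL;
    return idlest;
}

/* Assumed reading of the proposed behaviour; not the actual patch. */
static const struct toy_group *
new_pick(const struct toy_group *g, int n, int imbalance)
{
    const struct toy_group *idlest = NULL, *this_group = NULL;
    unsigned long min_load = ULONG_MAX, this_load = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (!g[i].allowed)
            continue;
        if (g[i].local) {
            this_load = g[i].load;
            this_group = &g[i];
        } else if (g[i].load < min_load) {
            min_load = g[i].load;
            idlest = &g[i];
        }
    }
    if (!idlest)
        return this_group;          /* situation 1 (NULL) or 2 (local) */
    if (!this_group)
        return idlest;              /* situation 3 */
    /* Situation 4: prefer the local group unless it is clearly heavier. */
    if (100 * this_load < (unsigned long)imbalance * min_load)
        return this_group;
    return idlest;
}
```

With an example imbalance of 125, a single allowed remote group of load 5
and no allowed local group (situation 3) makes old_pick() return NULL,
while new_pick() returns that remote group.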

>
> The advantage of a recursive zooming through find_idlest_group() is to
> scale better with the number of CPUs. It's probably like O(log n)
> instead of O(n).
>
> But it's possible I misunderstood something.
>
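
The scaling argument quoted above can be put in rough numbers with a small
counting model. This is a sketch under strong assumptions: the domain
hierarchy is balanced, every level has the same number of groups
(`fanout`), and comparing one group costs one step, which ignores the cost
of computing each group's load. zoom_steps() and flat_steps() are
hypothetical helpers, not kernel functions.

```c
#include <assert.h>

/* Group comparisons when zooming level by level into one group. */
static unsigned long zoom_steps(unsigned long ncpus, unsigned long fanout)
{
    unsigned long steps = 0, span = ncpus;

    assert(fanout >= 2);
    while (span > 1) {
        steps += fanout;                     /* compare groups at this level */
        span = (span + fanout - 1) / fanout; /* descend into one group */
    }
    return steps;
}

/* CPU visits when one group of `span` CPUs is scanned flat. */
static unsigned long flat_steps(unsigned long span)
{
    return span;
}
```

Under these assumptions, 64 CPUs with a fanout of 2 take 12 comparisons
when zooming, against 64 visits for a flat scan of the whole span.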
