Re: [RFC PATCH v2 3/6] sched: pack small tasks

From: Alex Shi
Date: Thu Dec 13 2012 - 09:25:20 EST


On 12/13/2012 06:11 PM, Vincent Guittot wrote:
> On 13 December 2012 03:17, Alex Shi <alex.shi@xxxxxxxxx> wrote:
>> On 12/12/2012 09:31 PM, Vincent Guittot wrote:
>>> During the creation of sched_domain, we define a pack buddy CPU for each CPU
>>> when one is available. We want to pack at all levels where a group of CPUs can
>>> be power gated independently from others.
>>> On a system that can't power gate a group of CPUs independently, the flag is
>>> set at all sched_domain levels and the buddy is set to -1. This is the default
>>> behavior.
>>> On a dual-cluster / dual-core system which can power gate each core and
>>> cluster independently, the buddy configuration will be:
>>>
>>>       | Cluster 0   | Cluster 1   |
>>>       | CPU0 | CPU1 | CPU2 | CPU3 |
>>> -----------------------------------
>>> buddy | CPU0 | CPU0 | CPU0 | CPU2 |
>>>
>>> Small tasks tend to slip out of the periodic load balance, so the best place
>>> to migrate them is at wake-up. The decision is O(1) as we only check against
>>> one buddy CPU.
>>
>> I am a little worried about scalability on a big machine. On a 4-socket
>> NUMA machine with 8 cores per socket and HT, the buddy CPU for the whole
>> system has to take care of 64 LCPUs, while in your case CPU0 only takes
>> care of 4 LCPUs. That makes the task distribution decision different.
>
> The buddy CPU should probably not be the same for all 64 LCPUs; it
> depends on where it's worth packing small tasks.

Do you have further ideas for choosing the buddy CPU in such an example?
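
To make the question concrete, the 4-CPU buddy table quoted above can be
modelled in userspace as below. This is only a sketch in Python, not the
patch code: select_wake_cpu(), packing_chain() and the buddy_is_busy check
are hypothetical names and assumptions; only the table values and the -1
"no buddy" default come from the description.

```python
# Userspace model of the buddy table quoted above; not the patch code.
# CPU -> buddy, copied from the dual-cluster / dual-core example.
PACK_BUDDY = {0: 0, 1: 0, 2: 0, 3: 2}


def select_wake_cpu(prev_cpu, task_is_small, buddy_is_busy):
    """Hypothetical wake-up choice: a single table lookup, hence O(1)."""
    # -1 stands for "no buddy", the default when nothing can be power gated.
    buddy = PACK_BUDDY.get(prev_cpu, -1)
    # Assumption: only small tasks are packed, and only onto a non-busy buddy.
    if buddy == -1 or not task_is_small or buddy_is_busy:
        return prev_cpu
    return buddy


def packing_chain(cpu):
    """Follow buddies until a CPU is its own buddy (the final packing target)."""
    chain = [cpu]
    while PACK_BUDDY[chain[-1]] != chain[-1]:
        chain.append(PACK_BUDDY[chain[-1]])
    return chain
```

With this table a small task waking on CPU3 is first packed on CPU2, and
from there on CPU0, so everything converges on one CPU. On the 64 LCPU
machine the open question is what the table should contain.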
>
> What kind of sched_domain configuration do you have for such a system,
> and how many sched_domain levels does it have?

It is the general x86 domain configuration, with 4 levels:
sibling/core/cpu/numa.
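
The arithmetic behind the 64 LCPU figure, as a rough Python sketch. The
linear LCPU numbering and the per_socket_buddy() policy are assumptions for
illustration only; real x86 enumeration may differ, and nothing here is
taken from the patch.

```python
# 4 sockets * 8 cores * HT (2 threads per core), as in the example machine.
SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE = 4, 8, 2
LCPUS_PER_SOCKET = CORES_PER_SOCKET * THREADS_PER_CORE


def lcpu_count():
    """Total number of logical CPUs in the system."""
    return SOCKETS * LCPUS_PER_SOCKET


def locate(lcpu):
    """Assumed linear numbering: HT siblings adjacent, then cores, then sockets."""
    socket, rest = divmod(lcpu, LCPUS_PER_SOCKET)
    core, thread = divmod(rest, THREADS_PER_CORE)
    return socket, core, thread


def per_socket_buddy(lcpu):
    """Hypothetical policy: the first LCPU of each socket acts as the buddy."""
    socket, _, _ = locate(lcpu)
    return socket * LCPUS_PER_SOCKET
```

With one buddy for the whole system, that CPU has to care about all
lcpu_count() LCPUs; with the hypothetical per-socket policy each buddy only
covers the 16 LCPUs of its own socket.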
>