Re: [patch v3 0/8] sched: use runnable avg in load balance

From: Alex Shi
Date: Wed Apr 03 2013 - 00:28:49 EST


On 04/03/2013 11:23 AM, Michael Wang wrote:
> On 04/03/2013 10:56 AM, Alex Shi wrote:
>> On 04/03/2013 10:46 AM, Michael Wang wrote:
>>> | 15 GB | 16 | 45110 | | 48091 |
>>> | 15 GB | 24 | 41415 | | 47415 |
>>> | 15 GB | 32 | 35988 | | 45749 | +27.12%
>>>
>>> Very nice improvement, I'd like to test it with the wake-affine throttle
>>> patch later, let's see what will happen ;-)
>>>
>>> Any idea on why the last one caused the regression?
>>
>> you can change the burst threshold: sysctl_sched_migration_cost, to see
>> what's happen with different value. create a similar knob and tune it.
>> +
>> +	if (cpu_rq(this_cpu)->avg_idle < sysctl_sched_migration_cost)
>> +		burst_this = 1;
>> +	if (cpu_rq(prev_cpu)->avg_idle < sysctl_sched_migration_cost)
>> +		burst_prev = 1;
>> +
>>
>>
>
> This changing the rate of adopt cpu_rq(cpu)->load.weight, correct?
>
> So if rq is busy, cpu_rq(cpu)->load.weight is capable enough to stand
> for the load status of rq? what's the really idea here?

This patch tries to resolve the regression on aim7-like benchmarks.
If many tasks sleep for a long time, their runnable load is zero. If they
are then woken up in a burst, the too-light runnable load causes a big
imbalance in select_task_rq. So such benchmarks, like aim9, drop 5~7%.

This patch tries to detect the burst; if there is one, it uses the load
weight directly instead of the zero runnable load avg, to avoid the
imbalance.
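
To make the idea concrete, here is a rough userspace sketch, not the
kernel code. struct rq_sketch, rq_in_burst(), effective_load() and the
500000 ns threshold are made up for illustration; only the comparison of
avg_idle against the migration cost comes from the hunk quoted above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU runqueue fields discussed here. */
struct rq_sketch {
	unsigned long avg_idle;          /* average idle time, in ns */
	unsigned long load_weight;       /* instantaneous load weight */
	unsigned long runnable_load_avg; /* runnable load average */
};

/* Assumed value standing in for sysctl_sched_migration_cost (ns). */
static const unsigned long migration_cost_sketch = 500000UL;

/* A short average idle time is treated as a sign of a wakeup burst. */
static bool rq_in_burst(const struct rq_sketch *rq)
{
	return rq->avg_idle < migration_cost_sketch;
}

/*
 * In a burst, fall back to the load weight, since the runnable load avg
 * of long sleepers may still be close to zero. Otherwise keep using the
 * runnable load avg.
 */
static unsigned long effective_load(const struct rq_sketch *rq)
{
	return rq_in_burst(rq) ? rq->load_weight : rq->runnable_load_avg;
}
```

With this, a rq whose avg_idle is below the threshold reports its load
weight even if its runnable load avg is still zero, while a rq with a
long avg_idle keeps reporting the runnable load avg.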

But the patch may cause some unfairness if this/prev cpu are not in a
burst at the same time. So could you try the following patch?