Re: [REGRESSION 2.6.30][PATCH v3] sched: update load count only once per cpu in 10 tick update window

From: Chase Douglas
Date: Thu Apr 22 2010 - 11:35:37 EST


On Thu, Apr 22, 2010 at 9:18 AM, Chase Douglas
<chase.douglas@xxxxxxxxxxxxx> wrote:
> On Thu, Apr 22, 2010 at 7:08 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> On Tue, 2010-04-13 at 16:19 -0700, Chase Douglas wrote:
>>>
>>> There's a period of 10 ticks where calc_load_tasks is updated by all the
>>> cpus for the load avg. Usually all the cpus do this during the first
>>> tick. If any cpus go idle, calc_load_tasks is decremented accordingly.
>>> However, if they wake up calc_load_tasks is not incremented. Thus, if
>>> cpus go idle during the 10 tick period, calc_load_tasks may be
>>> decremented to a non-representative value. This issue can lead to
>>> systems having a load avg of exactly 0, even though the real load avg
>>> could theoretically be up to NR_CPUS.
>>>
>>> Once a cpu has updated the count, this change defers any further
>>> calc_load_tasks accounting for that cpu until after the 10 tick
>>> update window.
>>>
>>> A few points:
>>>
>>> * A global atomic deferral counter, and not per-cpu vars, is needed
>>>   because a cpu may go NOHZ idle and not be able to update the global
>>>   calc_load_tasks variable for subsequent load calculations.
>>> * It is not enough to add calls to account for the load when a cpu is
>>>   awakened:
>>>   - Load avg calculation must be independent of cpu load.
>>>   - If a cpu is awakened by one task, but then has more scheduled before
>>>     the end of the update window, only the first task will be accounted.
>>>
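
For anyone following along, here is a minimal user-space sketch of the
accounting problem described in the changelog above. It is a simplified
model, not the kernel code and not either patch: the struct, the sketch_*
function names, the "deferred" field and the defer_idle switch are all
hypothetical, and there is no atomicity or NOHZ handling. It only shows
why decrementing on idle inside the window can drive the sampled count
to 0, and how parking those decrements in a global counter until the
window ends avoids that.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_CPUS 8   /* hypothetical limit for this model only */

struct load_sketch {
    long calc_load_tasks;              /* global count sampled for the load avg */
    long deferred;                     /* global deferral counter (hypothetical) */
    long cpu_active[SKETCH_MAX_CPUS];  /* what each cpu last reported */
    bool defer_idle;                   /* false: old behaviour, true: deferred */
};

/* Return the change in a cpu's contribution and remember the new value. */
static long fold_active(struct load_sketch *s, int cpu, long nr_active)
{
    long delta;

    assert(cpu >= 0 && cpu < SKETCH_MAX_CPUS);
    delta = nr_active - s->cpu_active[cpu];
    s->cpu_active[cpu] = nr_active;
    return delta;
}

/* A cpu reports its runnable tasks, normally on the first tick of the window. */
void sketch_cpu_tick(struct load_sketch *s, int cpu, long nr_running)
{
    s->calc_load_tasks += fold_active(s, cpu, nr_running);
}

/* A cpu goes idle inside the window, so its contribution drops to zero. */
void sketch_cpu_idle(struct load_sketch *s, int cpu)
{
    long delta = fold_active(s, cpu, 0);

    if (s->defer_idle)
        s->deferred += delta;          /* park the decrement until window end */
    else
        s->calc_load_tasks += delta;   /* old behaviour: decrement immediately */
}

/* End of window: sample the count first, then apply the parked deltas. */
long sketch_window_end(struct load_sketch *s)
{
    long sampled = s->calc_load_tasks;

    s->calc_load_tasks += s->deferred;
    s->deferred = 0;
    return sampled;
}
```

In this model, with defer_idle false, two cpus that each report one task
and then go idle before the window ends leave a sampled count of 0. With
defer_idle true the same sequence samples 2, and the decrements are only
applied after the sample is taken.
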
>>
>> Ok, so delaying the whole ILB angle for now, the below is a similar
>> approach to yours but with a more explicit code flow.
>>
>> Does that work for you?
>
> This looks good. I'll run my test case to make sure it fixes the
> scenario we hit, and then I'll ack it when I've confirmed it works.

I've run my test case and it seems to push the load avg numbers as expected.

Acked-by: Chase Douglas <chase.douglas@xxxxxxxxxxxxx>

BTW, I noticed some trailing whitespace, so I ran it through checkpatch.pl:

ERROR: trailing whitespace
#44: FILE: kernel/sched.c:2936:
+ $

Thanks

-- Chase