Re: High CPU load when machine is idle (related to PROBLEM: Unusually high load average when idle in 2.6.35, 2.6.35.1 and later)

From: Venkatesh Pallipadi
Date: Thu Oct 21 2010 - 13:19:09 EST


On Thu, Oct 21, 2010 at 5:09 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Wed, 2010-10-20 at 19:26 +0200, Peter Zijlstra wrote:
>
>> -static void calc_load_account_idle(struct rq *this_rq)
>> +void calc_load_account_idle(void)
>>  {
>> +     struct rq *this_rq = this_rq();
>>       long delta;
>>
>>       delta = calc_load_fold_active(this_rq);
>> +     this_rq->calc_load_inactive = delta;
>> +     this_rq->calc_load_seq = atomic_read(&calc_load_seq);
>> +
>>       if (delta)
>>               atomic_long_add(delta, &calc_load_tasks_idle);
>>  }
>>
>> +void calc_load_account_nonidle(void)
>> +{
>> +     struct rq *this_rq = this_rq();
>> +
>> +     if (atomic_read(&calc_load_seq) == this_rq->calc_load_seq) {
>> +             atomic_long_sub(this_rq->calc_load_inactive, &calc_load_tasks_idle);
>> +             /*
>> +              * Undo the _fold_active() from _account_idle(). This
>> +              * avoids us losing active tasks and creating a negative
>> +              * bias
>> +              */
>> +             this_rq->calc_load_active -= this_rq->calc_load_inactive;
>> +     }
>> +}
>
> Ok, so while trying to write a changelog on this patch I got myself
> terribly confused again..
>
> calc_load_fold_active() is a relative operation and simply gives delta
> values since the last time it got called. That means that the sum of
> multiple invocations in a given time interval should be identical to a
> single invocation.
>
> Therefore, the going idle multiple times during LOAD_FREQ hypothesis
> doesn't really make sense.
>

Yes. That's what I was thinking while trying to understand this code yesterday.
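
To make the "relative operation" argument concrete, here is a minimal
user-space sketch. It is not the kernel code: struct toy_rq and the toy_*
helpers are hypothetical stand-ins that only mirror the shape of
calc_load_fold_active(), with no atomics and no locking. It shows that
folding after every state change and folding once at the end produce the
same total delta.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rq; only the fields needed here. */
struct toy_rq {
	long nr_running;
	long nr_uninterruptible;
	long calc_load_active;	/* active count recorded at the last fold */
};

/* Return the change in active count since the previous call. */
static long toy_fold_active(struct toy_rq *rq)
{
	long nr_active = rq->nr_running + rq->nr_uninterruptible;
	long delta = nr_active - rq->calc_load_active;

	rq->calc_load_active = nr_active;
	return delta;
}

/* Fold after each of n state changes and return the summed deltas. */
static long toy_fold_many(struct toy_rq *rq, const long *running, int n)
{
	long sum = 0;

	for (int i = 0; i < n; i++) {
		rq->nr_running = running[i];
		sum += toy_fold_active(rq);
	}
	return sum;
}

/* Apply the same state changes but fold only once at the end. */
static long toy_fold_once(struct toy_rq *rq, const long *running, int n)
{
	for (int i = 0; i < n; i++)
		rq->nr_running = running[i];
	return toy_fold_active(rq);
}

/* Self-check: both strategies must agree for a CPU that idles twice. */
static void toy_check_telescoping(void)
{
	const long seq[] = { 3, 0, 2, 0, 5 };
	struct toy_rq a = { 0, 0, 0 }, b = { 0, 0, 0 };

	assert(toy_fold_many(&a, seq, 5) == toy_fold_once(&b, seq, 5));
}
```

Calling toy_check_telescoping() from a main() passes silently. The
intermediate deltas cancel, so in this simplified model the number of
folds in an interval does not change the sum.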

Also, with the sequence number I don't think nr_uninterruptible would be
handled correctly, as tasks can move to a CPU after it first went idle and
may not get accounted later.

I somehow feel the problem is with nr_uninterruptible, which gets accounted
multiple times on idle tasks and only once per LOAD_FREQ on busy tasks.
However, things are not fully clear to me yet. Have to look at the code a
bit more.

Thanks,
Venki

> Even if it became idle but wasn't idle at the LOAD_FREQ turn-over it
> shouldn't matter, since the calc_load_account_active() call will simply
> fold the remaining delta with the accrued idle delta and the total
> should all match up once we fold into the global calc_load_tasks.
>
> So afaict it should all have worked and this patch is a big NOP,
> except it isn't..
>
> Damn I hate this bug.. ;-) Anybody?
>
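
The folding described in the quoted text (idle deltas accrue separately and
are merged at the LOAD_FREQ turn-over) can be sketched as below. This is a
single-CPU, single-threaded model under loud assumptions: struct sim_rq,
struct sim_global and the sim_* functions are hypothetical, the two long
fields stand in for calc_load_tasks and calc_load_tasks_idle, and nothing
here models timing, concurrency, or tasks moving between CPUs.

```c
#include <assert.h>

/* Hypothetical per-CPU state: current active count and last folded value. */
struct sim_rq {
	long nr_active;
	long calc_load_active;
};

/* Hypothetical global state standing in for the two atomic counters. */
struct sim_global {
	long tasks;		/* stands in for calc_load_tasks */
	long tasks_idle;	/* stands in for calc_load_tasks_idle */
};

/* Return the change in active count since the previous fold. */
static long sim_fold_active(struct sim_rq *rq)
{
	long delta = rq->nr_active - rq->calc_load_active;

	rq->calc_load_active = rq->nr_active;
	return delta;
}

/* Going idle: park the delta in the idle accumulator. */
static void sim_account_idle(struct sim_global *g, struct sim_rq *rq)
{
	g->tasks_idle += sim_fold_active(rq);
}

/* Turn-over: fold the remaining delta plus the accrued idle delta. */
static void sim_account_active(struct sim_global *g, struct sim_rq *rq)
{
	long delta = sim_fold_active(rq);

	delta += g->tasks_idle;
	g->tasks_idle = 0;
	g->tasks += delta;
}

/*
 * One CPU starts with 2 active tasks, goes idle idle_trips times inside
 * one window, and has final_active tasks at the next turn-over.
 * Returns the global count after that turn-over.
 */
static long sim_scenario(int idle_trips, long final_active)
{
	struct sim_rq rq = { .nr_active = 2, .calc_load_active = 0 };
	struct sim_global g = { 0, 0 };

	sim_account_active(&g, &rq);	/* first turn-over */
	assert(g.tasks == 2);

	for (int i = 0; i < idle_trips; i++) {
		rq.nr_active = 0;	/* CPU goes idle */
		sim_account_idle(&g, &rq);
		rq.nr_active = 3;	/* tasks wake up; nothing folds here */
	}

	rq.nr_active = final_active;
	sim_account_active(&g, &rq);	/* next turn-over */
	return g.tasks;
}
```

In this model sim_scenario() returns final_active regardless of how many
idle trips happen in between, which agrees with the reasoning above that
the totals should match. It therefore says nothing about where the real
discrepancy comes from; it only illustrates why the patch looks like a NOP
on paper.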