Re: [patch v3 6/8] sched: consider runnable load average in move_tasks

From: Vincent Guittot
Date: Tue Apr 09 2013 - 11:16:34 EST


On 9 April 2013 16:48, Alex Shi <alex.shi@xxxxxxxxx> wrote:
> On 04/09/2013 07:56 PM, Vincent Guittot wrote:
>> On 9 April 2013 12:38, Alex Shi <alex.shi@xxxxxxxxx> wrote:
>>> On 04/09/2013 04:58 PM, Vincent Guittot wrote:
>>>>>>>> How do you ensure that runnable_avg_period and runnable_avg_sum are
>>>>>>>> coherent? An update of the statistics can occur in the middle of your
>>>>>>>> sequence.
>>>>>>
>>>>>> Thanks for your question, Vincent!
>>>>>> runnable_avg_period and runnable_avg_sum are only updated in
>>>>>> __update_entity_runnable_avg().
>>>>>> Yes, I didn't see any locks that ensure they stay coherent. But they
>>>>>> are updated close together, and it is not a big deal if their values
>>>>>> are slightly incorrect. This data is collected periodically and does
>>>>>> not need to be very precise at every moment.
>>>>>> Am I right? :)
>>>> The problem mainly appears during the starting phase (the 1st 345ms), when
>>>> runnable_avg_period has not reached the max value yet, so you can have
>>>> avg.runnable_avg_sum greater than avg.runnable_avg_period. In the worst
>>>> case, runnable_avg_sum could be twice runnable_avg_period.
>>>
>>> Oh, that's a serious problem. Did you catch it in the real world or in the code?
>>
>> I don't have a trace that shows this issue, but nothing prevents an update
>> from occurring while you read the values, so you can get a mix of old and
>> new values.
>>
>>> Could you explain in more detail?
>>
>> Both fields of a new task increase simultaneously but if you get the
>> old value for runnable_avg_period and the new one for
>> runnable_avg_sum, runnable_avg_sum will be greater than
>> runnable_avg_period during this starting phase.
>>
>> The worst case appears 2ms after the creation of the task, when
>> runnable_avg_period and runnable_avg_sum should both go from 1024 to 2046.
>> If you read runnable_avg_period as 1024 and runnable_avg_sum as 2046,
>> task_h_load_avg will be 199% of task_h_load.
>
> Thanks a lot for sharing the info, Vincent!
>
> But I checked rq->avg and task->se.avg, and it seems neither of them can
> be updated on a different CPU at the same time. So my printk
> didn't catch this with the kbuild and aim7 benchmarks on my SNB EP box.

The problem can happen because the reader and the writer access the
variables asynchronously and on different CPUs:

CPUA write runnable_avg_sum
CPUB read runnable_avg_sum
CPUB read runnable_avg_period
CPUA write runnable_avg_period

I agree that the time window during which this can occur is short,
but the race is not impossible.
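
To make the interleaving above concrete, here is a minimal sketch in plain
user-space C. It is not kernel code: struct avg_sketch, struct avg_snapshot,
replay_torn_read() and ratio_pct() are hypothetical names made up for this
illustration, and the sequence is replayed in a single thread, so it shows
the outcome of the race rather than reproducing it. The values 1024 and 2046
are the ones from the worst case quoted earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields discussed in this thread. */
struct avg_sketch {
	uint32_t runnable_avg_sum;
	uint32_t runnable_avg_period;
};

/* The pair of values a reader on another CPU happened to observe. */
struct avg_snapshot {
	uint32_t sum;
	uint32_t period;
};

/*
 * Replay the four steps in order, in one thread:
 * CPUA writes sum, CPUB reads sum and period, CPUA writes period.
 */
static struct avg_snapshot replay_torn_read(struct avg_sketch *avg,
					    uint32_t new_sum,
					    uint32_t new_period)
{
	struct avg_snapshot seen;

	avg->runnable_avg_sum = new_sum;	/* CPUA write runnable_avg_sum */
	seen.sum = avg->runnable_avg_sum;	/* CPUB read runnable_avg_sum (new) */
	seen.period = avg->runnable_avg_period;	/* CPUB read runnable_avg_period (old) */
	avg->runnable_avg_period = new_period;	/* CPUA write runnable_avg_period */

	return seen;
}

/* sum / period as an integer percentage, truncated like integer division. */
static uint32_t ratio_pct(struct avg_snapshot s)
{
	assert(s.period != 0);
	return (uint32_t)((uint64_t)s.sum * 100 / s.period);
}
```

Starting from sum = period = 1024 and replaying an update to 2046 for both
fields, the reader sees sum = 2046 with period = 1024, and ratio_pct()
returns 199.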

Vincent
>
> Then I find some words in your commit log:
> "If a CPU accesses the runnable_avg_sum and runnable_avg_period fields
> of its buddy CPU while the latter updates it, it can get the new version
> of a field and the old version of the other one."
> So is it possibly caused by the buddy CPU's access?
> Could you recheck this without your patch?
>
>
> --
> Thanks
> Alex