Re: false nr_running check in load balance?

From: Paul Turner
Date: Thu Aug 15 2013 - 14:24:34 EST


On Thu, Aug 15, 2013 at 10:39 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Tue, Aug 13, 2013 at 01:08:17AM -0700, Paul Turner wrote:
>> On Tue, Aug 13, 2013 at 12:38 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> > On Tue, Aug 13, 2013 at 12:45:12PM +0800, Lei Wen wrote:
>> >> > Not quite right; I think you need busiest->cfs.h_nr_running.
>> >> > cfs.nr_running is the number of entries running in this 'group'. If
>> >> > you've got nested groups like:
>> >> >
>> >> >        'root'
>> >> >           \
>> >> >           'A'
>> >> >           / \
>> >> >         t1   t2
>> >> >
>> >> > root.nr_running := 1 'A', even though you've got multiple running tasks.
>
> One thing though; doesn't h_nr_running over count the number of tasks?
> That is, doesn't it count the runnable entities so the above case would
> give root.h_nr_running := 3, where we would only have 2 runnable tasks.
>
> Double check this and be careful when doing the conversion.

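This should be ok: h_nr_running is accounted like rq->nr_running, not
like cfs_rq->nr_running.
Specifically: h_nr_running and rq->nr_running both account only tasks;
group-entities do not contribute. In the example above that gives
root.h_nr_running := 2, not 3.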
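
To make the difference concrete, here is a toy user-space model of the two
counters. It is only a sketch and not kernel code: struct toy_cfs_rq and
toy_enqueue_task() are names made up for this illustration, and the real
enqueue path does considerably more (throttling, load tracking, and so on).
The only assumption carried over is the accounting rule described above:
nr_running counts entities queued directly on a level, while h_nr_running
counts tasks at that level or anywhere below it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a per-group runqueue; NOT the kernel's struct cfs_rq. */
struct toy_cfs_rq {
	struct toy_cfs_rq *parent;	/* NULL for the root */
	unsigned int nr_running;	/* entities queued directly here (tasks or groups) */
	unsigned int h_nr_running;	/* tasks queued here or anywhere below */
};

/*
 * Queue one task on 'cfs_rq' and propagate the accounting to the root.
 * A group's own entity is queued on its parent only when the group goes
 * from empty to non-empty, so nr_running stops propagating early while
 * h_nr_running is bumped at every level.
 */
static void toy_enqueue_task(struct toy_cfs_rq *cfs_rq)
{
	int new_entity = 1;	/* the task itself is a new entity at the first level */

	for (; cfs_rq != NULL; cfs_rq = cfs_rq->parent) {
		int was_empty = (cfs_rq->nr_running == 0);

		if (new_entity)
			cfs_rq->nr_running++;
		cfs_rq->h_nr_running++;

		/* Only a previously empty group adds an entity to its parent. */
		new_entity = new_entity && was_empty;
	}
}
```

Queueing t1 and t2 on 'A' (whose parent is 'root') in this model leaves
root.nr_running at 1 (just the entity for 'A') and root.h_nr_running at 2
(the two tasks), which is the case from the diagram.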

It is unfortunate that this distinction exists despite the very similar
names.
We could consider renaming to h_nr_{running_,}tasks for clarity.
The same applies to rq->nr_running, although that would involve more churn.