Re: PROBLEM: Persistent unfair sharing of a processor by auto groups in 3.11-rc2 (has twice regressed)

From: Paul Turner
Date: Fri Jul 26 2013 - 21:09:14 EST


On Fri, Jul 26, 2013 at 2:50 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, Jul 26, 2013 at 02:24:50PM -0700, Paul Turner wrote:
>> On Fri, Jul 26, 2013 at 2:03 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> >
>> >
>> > OK, so I have the below; however, on a second look, Paul, shouldn't that
>> > update_cfs_shares() call be in entity_tick(), right after calling
>> > update_cfs_rq_blocked_load()? Placing it in
>> > update_cfs_rq_blocked_load() means it's now called twice on the
>> > enqueue/dequeue paths through:
>> >
>> > {en,de}queue_entity()
>> > {en,de}queue_entity_load_avg()
>> > update_cfs_rq_blocked_load()
>> > update_cfs_shares()
>>
>> Yes, I agree: placing it directly in entity_tick() would be better.
>
> OK, how about the below then?

Looks good.

>
>> [ In f269ae046 the calls to update_cfs_rq_blocked_load() were amortized
>> and the separate updates in {en,de}queue_entity_load_avg() were
>> removed. ]
>
> Right, I remember/saw that. Did you ever figure out why that regressed;
> as in should we look to bring some of that back?

Yes, the savings are measurable (we actually still use it internally).

So the particular problem in Linus's workload was that the amortization meant
a longer delay until the first update for a newly created task. This then
interacted badly with a make -j <N> workload, since it allowed the new
tasks to be over-represented in terms of the group shares they received.

With:
a75cdaa9: "sched: Set an initial value of runnable avg for new forked task"

this should now be improved, so we should look at bringing it back.