Re: [PATCH] sched/fair: Consider RT/IRQ pressure in capacity_spare_wake

From: Vincent Guittot
Date: Thu Dec 14 2017 - 10:47:24 EST


Hi Joel,

On 13 December 2017 at 21:00, Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
> On Mon, Dec 11, 2017 at 4:43 PM, Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
>> Hi Vincent,
>>
>>>>
>>>>>
>>>>>> ------------------------------------------------------------
>>>>>> Here we have RT activity running on big CPU cluster induced with rt-app,
>>>>>> and running hackbench in parallel. The RT tasks are bound to 4 CPUs on
>>>>>> the big cluster (cpu 4,5,6,7) and have 100ms periodicity with
>>>>>> runtime=20ms sleep=80ms.
>>>>>>
>>>>>> Hackbench shows a big improvement (up to 30%) when the number of tasks
>>>>>> is 8 and 32. Note: data is completion time in seconds (lower is better).
>>>>>> Number of loops for 8 and 16 tasks is 50000, and for 32 tasks it's 20000.
>>>>>> +--------+-----+-------+-------------------+---------------------------+
>>>>>> | groups | fds | tasks |   Without Patch   |        With Patch         |
>>>>>> |        |     |       +---------+---------+-----------------+---------+
>>>>>> |        |     |       |  Mean   |  Stdev  |      Mean       |  Stdev  |
>>>>>> +--------+-----+-------+---------+---------+-----------------+---------+
>>>>>> |   1    |  8  |   8   | 1.0534  | 0.13722 | 0.7293 (+30.7%) | 0.02653 |
>>>>>> |   2    |  8  |  16   | 1.6219  | 0.16631 | 1.6391 (-1%)    | 0.24001 |
>>>>>> |   4    |  8  |  32   | 1.2538  | 0.13086 | 1.1080 (+11.6%) | 0.16201 |
>>>>>> +--------+-----+-------+---------+---------+-----------------+---------+
>>>>>
>>>>> Out of curiosity, do you know why you don't see any improvement for
>>>>> 16 tasks but only for 8 and 32 tasks ?
>>>>
>>>> Yes I'm not fully sure why 16 tasks didn't show that much improvement.
>>>
>>> Yes. This is just to make sure that there is no unexpected side effect
>>
>
> It could have been sloppy testing - I could have hit thermal
> throttling or forgotten to stop the Android runtime before running the
> test. Looking at my old data, the case for 16 tasks has higher
> completion times than 32 tasks, which doesn't make sense. Sorry about
> that. I was careful this time: I recreated the product tree, applied
> the patch, and ran the same test as in this patch. Data prefixed
> with "with" is with the patch and "without" is without the patch.
>
> The naming of the Test column is "<test>-<numFDs>-<numGroups>". Data
> is completion time of hackbench in seconds.
>
> RUN 1:
>
> Test            Mean               Median             Stddev
> with-f4-1g      0.67645 (+3.7%)    0.68000 (+3.8%)    0.025755
> with-f4-2g      1.0685  (-0.3%)    1.0570  (+1%)      0.044122
> with-f4-4g      1.7558  (+0.7%)    1.7685  (+0.08%)   0.096015
>
> without-f4-1g   0.70255            0.70750            0.025330
> without-f4-2g   1.0653             1.0680             0.040300
> without-f4-4g   1.7688             1.7670             0.046341
>
> RUN 2:
>
> Test            Mean               Median             Stddev
> with-f4-1g      0.68100 (+1%)      0.67800 (+2%)      0.025543
> with-f4-2g      1.0242  (+1.5%)    1.0260  (+1.5%)    0.042886
> with-f4-4g      1.6100  (+3%)      1.6075  (+3.7%)    0.052677
>
> without-f4-1g   0.68840            0.69150            0.030988
> without-f4-2g   1.0400             1.0420             0.034288
> without-f4-4g   1.6636             1.6670             0.056963
>
>
> Let me know what you think, thanks.

The improvement has decreased compared to the previous results, and there
is instability between your runs. As an example, run 2 without the patch
does better than run 1 with the patch for 2g and 4g.
Could you run the tests on an SMP Linux kernel instead of big.LITTLE
Android, in order to have a saner test environment and remove some
possible disturbances?
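
To make the run-to-run comparison concrete, here is a rough sketch in
Python that recomputes the per-run improvement from the mean completion
times quoted above and lists the groups where run 2 without the patch
beats run 1 with the patch. The helper names (improvement_pct,
cross_run_regressions) and the dictionary layout are made up for
illustration; only the numbers come from the tables in this thread.

```python
# Mean completion times in seconds, copied from the RUN 1 / RUN 2 tables
# quoted above (lower is better). The layout is an illustrative assumption.
runs = {
    "run1": {
        "with":    {"1g": 0.67645, "2g": 1.0685, "4g": 1.7558},
        "without": {"1g": 0.70255, "2g": 1.0653, "4g": 1.7688},
    },
    "run2": {
        "with":    {"1g": 0.68100, "2g": 1.0242, "4g": 1.6100},
        "without": {"1g": 0.68840, "2g": 1.0400, "4g": 1.6636},
    },
}

GROUPS = ("1g", "2g", "4g")


def improvement_pct(without_s, with_s):
    """Relative improvement in percent; positive means the patch is faster."""
    return 100.0 * (without_s - with_s) / without_s


def cross_run_regressions(data, groups=GROUPS):
    """Groups where run 2 *without* the patch beats run 1 *with* the patch."""
    return [
        g for g in groups
        if data["run2"]["without"][g] < data["run1"]["with"][g]
    ]


if __name__ == "__main__":
    for name, run in runs.items():
        for g in GROUPS:
            pct = improvement_pct(run["without"][g], run["with"][g])
            print(f"{name} {g}: {pct:+.2f}%")
    print("run2-without beats run1-with for:", cross_run_regressions(runs))
```

When the second list is not empty, the difference between runs is at
least as large as the effect being measured, which is why a quieter
test setup would help.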

Vincent
>
> - Joel