Re: [PATCH] sched/fair: Consider RT/IRQ pressure in capacity_spare_wake

From: Joel Fernandes
Date: Thu Dec 14 2017 - 12:08:35 EST


Hi Vincent,
Thanks for your reply.

On Thu, Dec 14, 2017 at 7:46 AM, Vincent Guittot
<vincent.guittot@xxxxxxxxxx> wrote:
> Hi Joel,
>
> On 13 December 2017 at 21:00, Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
>> On Mon, Dec 11, 2017 at 4:43 PM, Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
>>> Hi Vincent,
>>>
>>>>>
>>>>>>
>>>>>>> ------------------------------------------------------------
>>>>>>> Here we have RT activity running on big CPU cluster induced with rt-app,
>>>>>>> and running hackbench in parallel. The RT tasks are bound to 4 CPUs on
>>>>>>> the big cluster (cpu 4,5,6,7) and have 100ms periodicity with
>>>>>>> runtime=20ms sleep=80ms.
>>>>>>>
>>>>>>> Hackbench shows a big improvement (up to 30%) when the number of tasks
>>>>>>> is 8 and 32. Note: data is completion time in seconds (lower is better).
>>>>>>> Number of loops for 8 and 16 tasks is 50000, and for 32 tasks it is 20000.
>>>>>>> +--------+-----+-------+-------------------+---------------------------+
>>>>>>> | groups | fds | tasks |   Without Patch   |        With Patch         |
>>>>>>> +--------+-----+-------+---------+---------+-----------------+---------+
>>>>>>> |        |     |       |  Mean   |  Stdev  |      Mean       |  Stdev  |
>>>>>>> |        |     |       +-------------------+-----------------+---------+
>>>>>>> |   1    |  8  |   8   | 1.0534  | 0.13722 | 0.7293 (+30.7%) | 0.02653 |
>>>>>>> |   2    |  8  |  16   | 1.6219  | 0.16631 | 1.6391 (-1%)    | 0.24001 |
>>>>>>> |   4    |  8  |  32   | 1.2538  | 0.13086 | 1.1080 (+11.6%) | 0.16201 |
>>>>>>> +--------+-----+-------+---------+---------+-----------------+---------+
>>>>>>
>>>>>> Out of curiosity, do you know why you don't see any improvement for
>>>>>> 16 tasks but only for 8 and 32 tasks ?
>>>>>
>>>>> Yes I'm not fully sure why 16 tasks didn't show that much improvement.
>>>>
>>>> Yes. This is just to make sure that there is no unexpected side effect
>>>
>>
>> It could have been sloppy testing - I could have hit thermal
>> throttling or forgotten to stop the Android runtime before running the
>> test. Looking at my old data, the case for 16 tasks has higher
>> completion times than 32 tasks, which doesn't make sense. Sorry about
>> that. I was careful this time: I recreated the product tree, applied
>> the patch, and ran the same test as in this patch. The data prefixed
>> with "with" is with the patch and "without" is without the patch.
>>
>> The naming of the Test column is "<test>-<numFDs>-<numGroups>". Data
>> is completion time of hackbench in seconds.
>>
>> RUN 1:
>>
>> Test            Mean               Median             Stddev
>> with-f4-1g      0.67645 (+3.7%)    0.68000 (+3.8%)    0.025755
>> with-f4-2g      1.0685  (-0.3%)    1.0570  (+1%)      0.044122
>> with-f4-4g      1.7558  (+0.7%)    1.7685  (+0.08%)   0.096015
>>
>> without-f4-1g   0.70255            0.70750            0.025330
>> without-f4-2g   1.0653             1.0680             0.040300
>> without-f4-4g   1.7688             1.7670             0.046341
>>
>> RUN 2:
>>
>> Test            Mean               Median             Stddev
>> with-f4-1g      0.68100 (+1%)      0.67800 (+2%)      0.025543
>> with-f4-2g      1.0242  (+1.5%)    1.0260  (+1.5%)    0.042886
>> with-f4-4g      1.6100  (+3%)      1.6075  (+3.7%)    0.052677
>>
>> without-f4-1g   0.68840            0.69150            0.030988
>> without-f4-2g   1.0400             1.0420             0.034288
>> without-f4-4g   1.6636             1.6670             0.056963
>>
>>
>> Let me know what you think, thanks.
>
> The improvement has decreased compared to previous results and there

Yes, but the previous result was invalid, as I mentioned. I controlled
the environment better this time. The previous result showed 4g
completing quicker than 2g, which wasn't very meaningful.

> is instability between your runs; as an example, run2 without patch
> does better than run1 with patch for 2g and 4g.

That's true. The improvement percent isn't stable.
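
For reference, here is a minimal sketch of the arithmetic behind this
comparison. It assumes the percentages in the tables are computed as
(without - with) / without on the means, which matches the posted
numbers to within rounding. The means are copied from RUN 1 and RUN 2
above; the function names are hypothetical and only for illustration.

```python
# Mean hackbench completion times in seconds, copied from RUN 1 and RUN 2.
runs = {
    "run1": {
        "with":    {"1g": 0.67645, "2g": 1.0685, "4g": 1.7558},
        "without": {"1g": 0.70255, "2g": 1.0653, "4g": 1.7688},
    },
    "run2": {
        "with":    {"1g": 0.68100, "2g": 1.0242, "4g": 1.6100},
        "without": {"1g": 0.68840, "2g": 1.0400, "4g": 1.6636},
    },
}


def improvement_pct(without, with_patch):
    """Assumed definition: relative reduction in completion time, in percent.

    Positive means the patched kernel finished faster.
    """
    return (without - with_patch) / without * 100.0


def per_run_improvement(data):
    """Improvement of 'with' over 'without' inside each run, per group count."""
    return {
        run: {g: improvement_pct(d["without"][g], d["with"][g]) for g in d["with"]}
        for run, d in data.items()
    }


def baseline_beats_patched(data, baseline_run, patched_run):
    """Group counts where the unpatched mean of baseline_run is lower
    (faster) than the patched mean of patched_run."""
    return [
        g
        for g in data[patched_run]["with"]
        if data[baseline_run]["without"][g] < data[patched_run]["with"][g]
    ]


for run, groups in per_run_improvement(runs).items():
    for g, pct in groups.items():
        print(f"{run} {g}: {pct:+.1f}%")

# Cross-run check: unpatched run2 against patched run1.
print(baseline_beats_patched(runs, "run2", "run1"))
```

The last line lists 2g and 4g, so for those cases the run-to-run
variation is larger than the gain measured within either run.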

> Could you run tests on an SMP Linux kernel instead of big.LITTLE
> Android in order to have a saner test environment and remove some
> possible disturbances?

Would it be OK with you if I just dropped this synthetic test from the
patch, since there are other hackbench results (case 3) from Rohit
which are on SMP?

Thanks,

- Joel