Re: [PATCH] sched/fair: Fix negative energy delta in find_energy_efficient_cpu()

From: Dietmar Eggemann
Date: Fri Apr 23 2021 - 11:48:11 EST


On 22/04/2021 11:44, Pierre Gondois wrote:
> Hi Dietmar,
> Thanks for the review,
>
> On 4/20/21 6:25 PM, Dietmar Eggemann wrote:
>
>> On 20/04/2021 14:56, Pierre.Gondois@xxxxxxx wrote:
>>> From: Pierre Gondois <Pierre.Gondois@xxxxxxx>

[...]

>> Did you run some tests to make sure you didn't regress on energy
>> consumption? You could run EAS' Energy tests w/ and w/o the patch
>> depicted in:
>>
>> https://lkml.kernel.org/r/20181203095628.11858-1-quentin.perret@xxxxxxx
>
>
> I executed the energy test you pointed at using LISA on a Juno-r2 (2xA57
> + 4xA53). The initial tests made by Quentin was on a Juno-r0 and a
> Hikey960.
>
> To recall the test:
> "10 iterations of between 10 and 50 periodic rt-app tasks (16ms period,
> 5% duty-cycle) for 30 seconds with energy measurement. Unit is Joules.
> The goal is to save energy, so lower is better."
> "Energy is measured with the onboard energy meter. Numbers include
> consumption of big and little CPUs."
>
> +----------+-----------------+-------------------------+
> |          | Without patches | With patches            |
> +----------+--------+--------+------------------+------+
> | Tasks nb |  Mean  |    CI* | Mean             |  CI* |
> +----------+--------+--------+------------------+------+
> |       10 |   6.57 |   0.24 |   6.46 (-1.69%)  | 0.16 |
> |       20 |  12.44 |   0.21 |  12.40 (-0.33%)  | 0.11 |
> |       30 |  19.10 |   0.78 |  18.93 (-0.89%)  | 0.46 |
> |       40 |  27.27 |   0.53 |  27.49 (+0.81%   | 0.17 |
> |       50 |  36.55 |   0.42 |  37.21 (+1.80%)  | 0.81 |
> +----------+-----------------+-------------------------+
> CI: confidence interval
>
> For each line, the intervals of values w/ wo/ the patches are
> overlapping (consider Mean +/- CI). Thus, the energy results shouldn't
> have been impacted.

Put this into the patch header so people see some testing has been done.

[...]

>>> +        if (compute_prev_delta) {
>>> +            prev_delta = compute_energy(p, prev_cpu, pd);
>>> +            /* Prevent negative deltas and select prev_cpu */
>> Not sure if this comment helps in understanding the code. We don't
>> comment that we return prev_cpu if !task_util_est(p) or we're not
>> entering the `Pick the best CPU ...` condition.
> I thought it was not obvious how (prev_delta < base_energy_pd) could
> happen. I'm ok to remove the comment, but maybe a sentence should be
> added in the function header or somewhere else.

Agreed. Remove the commend and add text in the patch header to
illustrate how you `fix negative energy delta ...`.

[...]

> Same comment: I'm ok to remove it, but we should explain what happens
> somewhere, maybe in the function header.

Same here.

[...]

>>> @@ -6674,25 +6688,20 @@ static int find_energy_efficient_cpu(struct
>>> task_struct *p, int prev_cpu)
>>>               }
>>>           }
>>>       }
>>> -unlock:
>>> -    rcu_read_unlock();
>> You don't close the RCU read-side critical section here anymore but
>> include the following if condition as well. Don't we always want to
>> close them as quick as possible? We could still return target (prev_cpu)
>> after the if condition below ...
> The computation should not take that long and this would make less code.
> Putting back the rcu_read_unlock() and returning faster instead of
> having a fall-through would also work for me.

I see but I would stay on the save side and keep the RCU read-side
critical section as short as possible.

>>>         /*
>>> -     * Pick the best CPU if prev_cpu cannot be used, or if it saves at
>>> -     * least 6% of the energy used by prev_cpu.
>>> +     * Pick the best CPU if:
>>> +     *  - prev_cpu cannot be used, or
>>> +     *  - it saves at least 6% of the energy used by prev_cpu
>>>        */
>> Why changing the layout of this comment?
>
> I thought it was clearer to have bullet points. It can be reverted.

Please revert. Keep the changes as small as possible.

[...]