Re: [PATCH v3 1/3] sched/fair: Add asymmetric CPU capacity wakeup scan

From: Valentin Schneider
Date: Wed Jan 29 2020 - 07:10:09 EST


On 29/01/2020 10:38, Dietmar Eggemann wrote:
> On 28/01/2020 12:30, Valentin Schneider wrote:
>> Hi Pavan,
>>
>> On 28/01/2020 06:22, Pavan Kondeti wrote:
>>> Hi Valentin,
>>>
>>> On Sun, Jan 26, 2020 at 08:09:32PM +0000, Valentin Schneider wrote:
>
> [...]
>
>>>> +
>>>> + if (!static_branch_unlikely(&sched_asym_cpucapacity))
>>>> + return -1;
>
> We do need this one to bail out quickly on non CPU asym systems. (1)
>
>>>> + sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, target));
>>>> + if (!sd)
>>>> + return -1;
>
> And I assume we can't return target here because of exclusive cpusets
> which can form symmetric CPU capacities islands on a CPU asymmetric
> system? (2)
>

Precisely, the "canonical" check for asymmetry is static key + SD pointer.
In terms of functionality we could "just" check sd_asym_cpucapacity (it can't
be set without having the static key set, though the reverse isn't true),
but we *want* to use the static key here to make SMP people happy.

>> That's not the case anymore, so indeed we may be able to bail out of
>> select_idle_sibling() right after select_idle_capacity() (or after the
>> prev / recent_used_cpu checks). Our only requirement here is that sd_llc
>> remains a subset of sd_asym_cpucapacity.
>
> How do you distinguish '-1' in (1), (2) and 'best_cpu = -1' (3)?
>
> In (1) and (2) you want to check if target is idle (or sched_idle) but
> in (3) you probably only want to check 'recent_used_cpu'?
>

Right, when we come back from select_idle_capacity(), and we did go through
the CPU loop, but we still returned -1, it means all CPUs in sd_asym_cpucapacity
were not idle. This includes 'target' of course, so we shouldn't need to
check it again.

In those cases we might still not have evaluated 'prev' or 'recent_used_cpu'.
It's somewhat of a last ditch attempt to find an idle CPU, and they'll only
help when they aren't in sd_asym_cpucapacity. I'm actually curious as to how
much the 'recent_used_cpu' thing helps, I've never paid it much attention.

So yeah my options are (for asym CPU capacity topologies):
a) just bail out after select_idle_capacity()
b) try to squeeze out a bit more out of select_idle_sibling() by also doing
the 'prev' & 'recent_used_cpu' checks.

a) is quite easy to implement; I can just inline the static key and sd checks
in select_idle_sibling() and return unconditionally once I'm past those
checks.

b) is more intrusive and I don't have a clear picture yet as to how much it
will really bring to the table.

I'll think about it and try to play around with both of these.