Re: Re: [PATCH v2] cpufreq: Register with perf domain before

From: Vincent Wang3
Date: Wed Jan 18 2023 - 05:53:16 EST


Hi, Lukasz, Viresh

I found this issue on Android phone with kernel 5.15, and the governor is schedutil.
With the pd_init issue, the rd->pd will be NULL and EAS doesn't work since rd->pd will be checked in function find_energy_efficient_cpu( ).

I didn't notice the modification in schedutil at mainline, so I submitted this patch to fix the problem, sorry!

However, besides EAS will check rd->pd, I noticed that it's also used in the find_busiest_group() and update_cpu_capacity() functions at mainline, so I'm worried that may not be enough to circumvent it only in schedutil.


detail of the exact code path:
in function register_cpufreq_notifier( ), the notifier_block init_cpu_capacity_notifier is registered with CPUFREQ_POLICY_NOTIFIER.
During the initialization of a new policy, the notifier_call function init_cpu_capacity_callback( ) will be called,in which the work update_topology_flags_work will be scheduled.

for the kwork function update_topology_flags_workfn( ):
update_topology_flags_workfn( ) -> rebuild_sched_domains( ) -> rebuild_sched_domains_locked( ) -> partition_sched_domains_locked( ) -> build_perf_domains( ) -> pd_init( )


BRs
Vincent

-----邮件原件-----
发件人: Lukasz Luba <lukasz.luba@xxxxxxx>
发送时间: 2023年1月18日 17:24
收件人: Viresh Kumar <viresh.kumar@xxxxxxxxxx>; Vincent Wang <bhuwz@xxxxxxx>; Vincent Wang3 <vincentwang3@xxxxxxxxxx>
抄送: rafael@xxxxxxxxxx; linux-pm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
主题: [External] Re: [PATCH v2] cpufreq: Register with perf domain before

Hi Viresh, Vincent

I'm surprised seeing this thread and thanks that you Viresh have answered, so it could go through my spam/junk filters.
(I hope when answer to that domain it would change something).

On 1/18/23 08:49, Viresh Kumar wrote:
> On 18-01-23, 12:47, Vincent Wang wrote:
>> From: Vincent Wang <vincentwang3@xxxxxxxxxx>
>>
>> We found the following issue during kernel boot on android phone:
>>
>> [ 1.325272][ T1] cpu cpu0: EM: created perf domain
>> [ 1.329317][ T1] cpu cpu4: EM: created perf domain
>> [ 1.337597][ T76] pd_init: no EM found for CPU7
>> [ 1.350849][ T1] cpu cpu7: EM: created perf domain
>>
>> pd init for cluster2 is executed in a kworker thread and is earlier
>> than the perf domain creation for cluster2.
>
> Can you please give detail of the exact code path, for mainline kernel
> ? I am not sure which kworker thread are you talking about here.

Please also tell us your cpufreq governor. The schedutil governor at mainline can handle those situations and we rebuild the perf domains here [1].

>
>> pd_init() is called from the cpufreq notification of
>> CPUFREQ_CREATE_POLICY in cpufreq_online(), which is earlier than that
>> cpufreq_driver->register_em() is called.

Viresh, If that's an issue for other governors, than maybe we should address that. IMO the patch just relies on side-effect in arch_topology.c update_topology_flags_workfn(). That would be a workaround and dangerous if someone would change that arch_topology.c design. Shouldn't we come up with something reliable inside the cpufreq.c if there is a real issue?
Even now, these two mechanisms:
1. in the schedutil [1]
2. in the update_topology_flags_workfn() are a bit leaky (in terms of design holes).

Regards,
Lukasz

[1]
https://apc01.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fkernel%2Fsched%2Fcpufreq_schedutil.c%23L858&data=05%7C01%7Cvincentwang3%40lenovo.com%7Cc69b7dca4401474bd23808daf935bfd5%7C5c7d0b28bdf8410caa934df372b16203%7C0%7C0%7C638096306507392144%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=WTPpWDytIsbfdW920cGNiHhEs7udgrX0O78sAh8JV7I%3D&reserved=0