Re: [PATCH v2] clk: samsung: Prevent potential endless loop in the PLL set_rate ops

From: Sylwester Nawrocki
Date: Tue Aug 11 2020 - 12:45:24 EST


Hi Tomasz,

On 11.08.2020 14:59, Tomasz Figa wrote:
> 2020年8月11日(火) 13:25 Sylwester Nawrocki <s.nawrocki@xxxxxxxxxxx>:
>>
>> In the .set_rate callback for some PLLs there is a loop polling state
>> of the PLL lock bit and it may become an endless loop when something
>> goes wrong with the PLL. For some PLLs there is already (a duplicated)
>> code for polling with timeout. This patch replaces that code with
>> the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
>> helper function, which is then used for all the PLLs. The downside
>> of switching to the common macro is that we drop the cpu_relax() call.
>
> Tbh. I'm not sure what effect was exactly expected from cpu_relax() in
> the functions which already had timeout handling. Could someone shed
> some light on this?
>
>> Using a common helper function rather than the macro directly allows
>> to avoid repeating the error message in the code and to avoid the object
>> code size increase due to inlining.
>>
>> Signed-off-by: Sylwester Nawrocki <s.nawrocki@xxxxxxxxxxx>
>> ---
>> Changes for v2:
>> - use common readl_relaxed_poll_timeout_atomic() macro
>> ---
>> drivers/clk/samsung/clk-pll.c | 92 +++++++++++++++----------------------------
>> 1 file changed, 32 insertions(+), 60 deletions(-)
>>
>> diff --git a/drivers/clk/samsung/clk-pll.c b/drivers/clk/samsung/clk-pll.c
>> index ac70ad7..c3c1efe 100644
>> --- a/drivers/clk/samsung/clk-pll.c
>> +++ b/drivers/clk/samsung/clk-pll.c
>> @@ -9,13 +9,14 @@

>> -#define PLL_TIMEOUT_MS 10
>> +#define PLL_TIMEOUT_US 10000U
>
> I'm also wondering if 10ms is the universal value that would cover the
> oldest PLLs as well, but my loose recollection is that they should
> still lock much faster than that. Could you double check that in the
> documentation?

Thanks for your comments.

The oldest PLLs have a hard coded 300 us waiting time for PLL lock and
are not affected by the patch.
I have checked some of the PLLs and maximum observed lock time was around
370 us and most of the time it was just a few us.

We calculate the lock time in each set_rate op, in the oscillator cycle
units, as a product of current P divider value and a constant PLL type
specific LOCK_FACTOR. Maximum possible P value is 64, maximum possible
LOCK_FACTOR is 3000. Assuming minimum VCO frequency of 24 MHz (which
I think will usually be much higher than that) maximum lock time
would be (64 x 3000) / 24 MHz = 8 ms. I think we can leave the current
10 ms value.

But there is other issue, it seems we can't really use the ktime API
in the set_rate callbacks, as these could be called early, before the
clocksource is initialized and ktime doesn't work yet. Below trace
is from a dump_stack() added to the samsung_pll_lock_wait() callback.
The PLL rate setting is triggered by assigned-clock* properties in
the clock supplier node.
I think we need to switch to a simple udelay() loop, as is done in
clk-tegra210 driver for instance.

[ 0.000000] Hardware name: Samsung Exynos (Flattened Device Tree)
[ 0.000000] [<c0111e9c>] (unwind_backtrace) from [<c010d0ec>] (show_stack+0x10/0x14)
[ 0.000000] [<c010d0ec>] (show_stack) from [<c051d890>] (dump_stack+0xac/0xd8)
[ 0.000000] [<c051d890>] (dump_stack) from [<c0578d94>] (samsung_pll_lock_wait+0x14/0x174)
[ 0.000000] [<c0578d94>] (samsung_pll_lock_wait) from [<c057319c>] (clk_change_rate+0x1a8/0x8ac)
[ 0.000000] [<c057319c>] (clk_change_rate) from [<c0573aec>] (clk_core_set_rate_nolock+0x24c/0x268)
[ 0.000000] [<c0573aec>] (clk_core_set_rate_nolock) from [<c0573b38>] (clk_set_rate+0x30/0x64)
[ 0.000000] [<c0573b38>] (clk_set_rate) from [<c0577df8>] (of_clk_set_defaults+0x214/0x384)
[ 0.000000] [<c0577df8>] (of_clk_set_defaults) from [<c0572f34>] (of_clk_add_hw_provider+0x98/0xd8)
[ 0.000000] [<c0572f34>] (of_clk_add_hw_provider) from [<c1120278>] (samsung_clk_of_add_provider+0x1c/0x30)
[ 0.000000] [<c1120278>] (samsung_clk_of_add_provider) from [<c1121844>] (exynos5250_clk_of_clk_init_driver+0x1f4/0x240)
[ 0.000000] [<c1121844>] (exynos5250_clk_of_clk_init_driver) from [<c11200d0>] (of_clk_init+0x16c/0x218)
[ 0.000000] [<c11200d0>] (of_clk_init) from [<c1104bdc>] (time_init+0x24/0x30)
[ 0.000000] [<c1104bdc>] (time_init) from [<c1100d20>] (start_kernel+0x3b0/0x520)
[ 0.000000] [<c1100d20>] (start_kernel) from [<00000000>] (0x0)
[ 0.000000] samsung_pll_lock_wait: PLL fout_epll, lock time: 0 us, ret: 0
[ 0.000000] Exynos5250: clock setup completed, armclk=1700000000
[ 0.000000] Switching to timer-based delay loop, resolution 41ns
[ 0.000000] clocksource: mct-frc: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
[ 0.000003] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
[ 0.000032] genirq: irq_chip COMBINER did not update eff. affinity mask of irq 49
[ 0.000523] arch_timer: cp15 timer(s) running at 24.00MHz (virt).
[ 0.000536] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x588fe9dc0, max_idle_ns: 440795202592 ns
[ 0.000551] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 4398046511097ns

--
Regards,
Sylwester