Re: [PATCH RESEND v2 1/2] clk: core: Honor CLK_OPS_PARENT_ENABLE for clk gate ops

From: Alexander Stein
Date: Wed Aug 31 2022 - 03:40:23 EST


Am Dienstag, 30. August 2022, 15:40:53 CEST schrieb Chen-Yu Tsai:
> On Tue, Aug 30, 2022 at 8:37 PM Alexander Stein
>
> <alexander.stein@xxxxxxxxxxxxxxx> wrote:
> > Hi,
> >
> > Am Montag, 29. August 2022, 11:22:16 CEST schrieb Chen-Yu Tsai:
> > > Hi,
> > >
> > > On Fri, Aug 26, 2022 at 8:28 PM Alexander Stein
> > >
> > > <alexander.stein@xxxxxxxxxxxxxxx> wrote:
> > > > Hi everybody,
> > > >
> > > > Am Montag, 22. August 2022, 10:14:23 CEST schrieb Chen-Yu Tsai:
> > > > > In the previous commits that added CLK_OPS_PARENT_ENABLE, support
> > > > > for
> > > > > this flag was only added to rate change operations (rate setting and
> > > > > reparent) and disabling unused subtree. It was not added to the
> > > > > clock gate related operations. Any hardware driver that needs it for
> > > > > these operations will either see bogus results, or worse, hang.
> > > > >
> > > > > This has been seen on MT8192 and MT8195, where the imp_ii2_* clk
> > > > > drivers set this, but dumping debugfs clk_summary would cause it
> > > > > to hang.
> > > > >
> > > > > Fixes: fc8726a2c021 ("clk: core: support clocks which requires
> > > > > parents
> > > > > enable (part 2)") Fixes: a4b3518d146f ("clk: core: support clocks
> > > > > which
> > > > > requires parents enable (part 1)") Signed-off-by: Chen-Yu Tsai
> > > > > <wenst@xxxxxxxxxxxx>
> > > > > Reviewed-by: Nícolas F. R. A. Prado <nfraprado@xxxxxxxxxxxxx>
> > > > > Tested-by: Nícolas F. R. A. Prado <nfraprado@xxxxxxxxxxxxx>
> > > > > ---
> > > > >
> > > > > drivers/clk/clk.c | 28 ++++++++++++++++++++++++++++
> > > > > 1 file changed, 28 insertions(+)
> > > > >
> > > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > > index 7fc191c15507..9b365cd6d14b 100644
> > > > > --- a/drivers/clk/clk.c
> > > > > +++ b/drivers/clk/clk.c
> > > > > @@ -196,6 +196,9 @@ static bool clk_core_rate_is_protected(struct
> > > > > clk_core
> > > > > *core) return core->protect_count;
> > > > >
> > > > > }
> > > > >
> > > > > +static int clk_core_prepare_enable(struct clk_core *core);
> > > > > +static void clk_core_disable_unprepare(struct clk_core *core);
> > > > > +
> > > > >
> > > > > static bool clk_core_is_prepared(struct clk_core *core)
> > > > > {
> > > > >
> > > > > bool ret = false;
> > > > >
> > > > > @@ -208,7 +211,11 @@ static bool clk_core_is_prepared(struct
> > > > > clk_core
> > > > > *core) return core->prepare_count;
> > > > >
> > > > > if (!clk_pm_runtime_get(core)) {
> > > > >
> > > > > + if (core->flags & CLK_OPS_PARENT_ENABLE)
> > > > > + clk_core_prepare_enable(core->parent);
> > > > >
> > > > > ret = core->ops->is_prepared(core->hw);
> > > > >
> > > > > + if (core->flags & CLK_OPS_PARENT_ENABLE)
> > > > > + clk_core_disable_unprepare(core->parent);
> > > > >
> > > > > clk_pm_runtime_put(core);
> > > > >
> > > > > }
> > > > >
> > > > > @@ -244,7 +251,13 @@ static bool clk_core_is_enabled(struct clk_core
> > > > > *core)
> > > > >
> > > > > }
> > > > >
> > > > > }
> > > > >
> > > > > + if (core->flags & CLK_OPS_PARENT_ENABLE)
> > > > > + clk_core_prepare_enable(core->parent);
> > > > > +
> > > > >
> > > > > ret = core->ops->is_enabled(core->hw);
> > > > >
> > > > > +
> > > > > + if (core->flags & CLK_OPS_PARENT_ENABLE)
> > > > > + clk_core_disable_unprepare(core->parent);
> > > > >
> > > > > done:
> > > > > if (core->rpm_enabled)
> > > > >
> > > > > pm_runtime_put(core->dev);
> > > > >
> > > > > @@ -812,6 +825,9 @@ int clk_rate_exclusive_get(struct clk *clk)
> > > > >
> > > > > }
> > > > > EXPORT_SYMBOL_GPL(clk_rate_exclusive_get);
> > > > >
> > > > > +static int clk_core_enable_lock(struct clk_core *core);
> > > > > +static void clk_core_disable_lock(struct clk_core *core);
> > > > > +
> > > > >
> > > > > static void clk_core_unprepare(struct clk_core *core)
> > > > > {
> > > > >
> > > > > lockdep_assert_held(&prepare_lock);
> > > > >
> > > > > @@ -835,6 +851,9 @@ static void clk_core_unprepare(struct clk_core
> > > > > *core)
> > > > >
> > > > > WARN(core->enable_count > 0, "Unpreparing enabled %s\n", core-
> > > > >
> > > > >name);
> > > > >
> > > > > + if (core->flags & CLK_OPS_PARENT_ENABLE)
> > > > > + clk_core_enable_lock(core->parent);
> > > > > +
> > > > >
> > > > > trace_clk_unprepare(core);
> > > > >
> > > > > if (core->ops->unprepare)
> > > > >
> > > > > @@ -843,6 +862,9 @@ static void clk_core_unprepare(struct clk_core
> > > > > *core)
> > > > >
> > > > > clk_pm_runtime_put(core);
> > > > >
> > > > > trace_clk_unprepare_complete(core);
> > > > >
> > > > > +
> > > > > + if (core->flags & CLK_OPS_PARENT_ENABLE)
> > > > > + clk_core_disable_lock(core->parent);
> > > > >
> > > > > clk_core_unprepare(core->parent);
> > > > >
> > > > > }
> > > > >
> > > > > @@ -891,6 +913,9 @@ static int clk_core_prepare(struct clk_core
> > > > > *core)
> > > > >
> > > > > if (ret)
> > > > >
> > > > > goto runtime_put;
> > > > >
> > > > > + if (core->flags & CLK_OPS_PARENT_ENABLE)
> > > > > + clk_core_enable_lock(core->parent);
> > > > > +
> > > > >
> > > > > trace_clk_prepare(core);
> > > > >
> > > > > if (core->ops->prepare)
> > > > >
> > > > > @@ -898,6 +923,9 @@ static int clk_core_prepare(struct clk_core
> > > > > *core)
> > > > >
> > > > > trace_clk_prepare_complete(core);
> > > > >
> > > > > + if (core->flags & CLK_OPS_PARENT_ENABLE)
> > > > > + clk_core_disable_lock(core->parent);
> > > > > +
> > > > >
> > > > > if (ret)
> > > > >
> > > > > goto unprepare;
> > > > >
> > > > > }
> > > >
> > > > Unfortunately this completely locks up my i.MX8M Plus based board
> > > > during
> > > > early boot.
> > > > I'm currently running on next-20220826 using
> > > > arch/arm64/boot/dts/freescale/
> > > > imx8mp-tqma8mpql-mba8mpxl.dts
> > > > Reverting this patch gets my board booting again. dmesg until hard
> > > > lockup
> > > > below.
> > >
> > > The standard logs don't have anything to go on. Could you add some
> > > printk
> > > calls to the clk core around the areas this patch touchs? That would
> > > help.
> > >
> > > Could you also provide a dump of /sys/kernel/debug/clk/clk_summary? That
> > > would help to understand the clock tree.
> >
> > Sure,
> >
> > These are the last kernel log lines before hard lockup:
> > [ 0.686357] io scheduler mq-deadline registered
> > [ 0.690907] io scheduler kyber registered
> > [ 0.699275] clk_core_prepare: clk: main_axi CLK_OPS_PARENT_ENABLE
> >
> > main_axi is also the only debug output up to this point. This is the used
> > patch for debugging:
> > ---8<---
> > --- a/drivers/clk/clk.c
> > +++ b/drivers/clk/clk.c
> > @@ -211,8 +211,10 @@ static bool clk_core_is_prepared(struct clk_core
> > *core)>
> > return core->prepare_count;
> >
> > if (!clk_pm_runtime_get(core)) {
> >
> > - if (core->flags & CLK_OPS_PARENT_ENABLE)
> > + if (core->flags & CLK_OPS_PARENT_ENABLE) {
> > + pr_info("%s: clk: %s CLK_OPS_PARENT_ENABLE\n",
> > __func__, core->name);
> >
> > clk_core_prepare_enable(core->parent);
> >
> > + }
> >
> > ret = core->ops->is_prepared(core->hw);
> > if (core->flags & CLK_OPS_PARENT_ENABLE)
> >
> > clk_core_disable_unprepare(core->parent);
> >
> > @@ -251,8 +253,10 @@ static bool clk_core_is_enabled(struct clk_core
> > *core)
> >
> > }
> >
> > }
> >
> > - if (core->flags & CLK_OPS_PARENT_ENABLE)
> > + if (core->flags & CLK_OPS_PARENT_ENABLE) {
> > + pr_info("%s: clk: %s CLK_OPS_PARENT_ENABLE\n", __func__,
> > core->
> > >name);
> > >
> > clk_core_prepare_enable(core->parent);
> >
> > + }
> >
> > ret = core->ops->is_enabled(core->hw);
> >
> > @@ -851,8 +855,10 @@ static void clk_core_unprepare(struct clk_core *core)
> >
> > WARN(core->enable_count > 0, "Unpreparing enabled %s\n",
> > core->name);
> >
> > - if (core->flags & CLK_OPS_PARENT_ENABLE)
> > + if (core->flags & CLK_OPS_PARENT_ENABLE) {
> > + pr_info("%s: clk: %s CLK_OPS_PARENT_ENABLE\n", __func__,
> > core->
> > >name);
> > >
> > clk_core_enable_lock(core->parent);
> >
> > + }
> >
> > trace_clk_unprepare(core);
> >
> > @@ -912,8 +918,10 @@ static int clk_core_prepare(struct clk_core *core)
> >
> > if (ret)
> >
> > goto runtime_put;
> >
> > - if (core->flags & CLK_OPS_PARENT_ENABLE)
> > + if (core->flags & CLK_OPS_PARENT_ENABLE) {
> > + pr_info("%s: clk: %s CLK_OPS_PARENT_ENABLE\n",
> > __func__, core->name);
> >
> > clk_core_enable_lock(core->parent);
> >
> > + }
> >
> > trace_clk_prepare(core);
> >
> > ---8<---
>
> Thanks. So the part of the clock tree that's problematic is this:
>
> osc_24m (fixed)
> sys_pll1_ref_sel (mux)
> sys_pll1 (imx pll14xx)
> sys_pll1_bypass (mux)
> sys_pll1_out (gate)
> sys_pll1_800m (fixed factor)
> main_axi (composite CLK_IS_CRITICAL)
>
> Would it be possible for you to produce a stack dump as well? And also
> enable lock debugging? This likely still won't catch anything if it's
> a spinlock deadlock though.

Sure thing, I added a dump_stack() call to all of the above debug outputs as
well as the name of the parent clock, just to be sure, and here is the result
output:
[ 1.142386] io scheduler mq-deadline registered
[ 1.146902] io scheduler kyber registered
[ 1.177345] clk_core_prepare: clk: main_axi CLK_OPS_PARENT_ENABLE
[ 1.180713] clk_core_prepare: clk->parent: sys_pll1_800m
[ 1.186025] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc3-
next-20220831+ #619 9c82e5b9075d735fa07e2c558603ffb720d72983
[ 1.197274] Hardware name: TQ-Systems i.MX8MPlus TQMa8MPxL on MBa8MPxL (DT)
[ 1.204275] Call trace:
[ 1.206723] dump_backtrace+0xd8/0x120
[ 1.210485] show_stack+0x14/0x50
[ 1.213811] dump_stack_lvl+0x88/0xb0
[ 1.217486] dump_stack+0x14/0x2c
[ 1.220811] clk_core_prepare+0x1fc/0x240
[ 1.224834] __clk_core_init+0x208/0x4dc
[ 1.228772] __clk_register+0x160/0x240
[ 1.232622] clk_hw_register+0x1c/0x5c
[ 1.236384] __clk_hw_register_composite+0x1e8/0x2d0
[ 1.241372] clk_hw_register_composite+0x40/0x50
[ 1.246009] __imx8m_clk_hw_composite+0x130/0x210
[ 1.250734] imx8mp_clocks_probe+0x13ac/0x3750
[ 1.255199] platform_probe+0x64/0x100
[ 1.258959] call_driver_probe+0x28/0x140
[ 1.262984] really_probe+0xc0/0x334
[ 1.266572] __driver_probe_device+0x84/0x144
[ 1.270947] driver_probe_device+0x38/0x130
[ 1.275147] __driver_attach+0xd4/0x240
[ 1.278997] bus_for_each_dev+0x6c/0xbc
[ 1.282847] driver_attach+0x20/0x30
[ 1.286436] bus_add_driver+0x178/0x250
[ 1.290284] driver_register+0x74/0x120
[ 1.294134] __platform_driver_register+0x24/0x30
[ 1.298859] imx8mp_clk_driver_init+0x18/0x20
[ 1.303234] do_one_initcall+0x48/0x180
[ 1.307084] do_initcalls+0x164/0x19c
[ 1.310759] kernel_init_freeable+0xf4/0x138
[ 1.315047] kernel_init+0x2c/0x140
[ 1.318549] ret_from_fork+0x10/0x20

Enabling lock debugging results in no additional output.

Best regards,
Alexander