Re: [PATCH] mmc: host: dw-mmc-rockchip: fix handling invalid clock rates

From: Peter Geis
Date: Thu Mar 03 2022 - 19:44:47 EST


On Thu, Mar 3, 2022 at 4:28 PM Peter Geis <pgwipeout@xxxxxxxxx> wrote:
>
> On Thu, Mar 3, 2022 at 5:21 AM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> >
> > On Thu, 3 Mar 2022 at 10:49, Peter Geis <pgwipeout@xxxxxxxxx> wrote:
> > >
> > > On Thu, Mar 3, 2022 at 2:53 AM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, 3 Mar 2022 at 02:52, Peter Geis <pgwipeout@xxxxxxxxx> wrote:
> > > > >
> > > > > The Rockchip ciu clock cannot be set as low as the dw-mmc hardware
> > > > > supports. This leads to a situation during card initialization where the
> > > > > ciu clock is set lower than the clock driver can support. The
> > > > > dw-mmc-rockchip driver spews errors when this happens.
> > > > > For normal operation this only happens a few times during boot, but when
> > > > > cd-broken is enabled (in cases such as the SoQuartz module) this fires
> > > > > multiple times each poll cycle.
> > > > >
> > > > > Fix this by testing the minimum frequency the clock driver can support
> > > > > that is within the mmc specification, then divide that by the internal
> > > > > clock divider. Set the f_min frequency to this value, or if it fails,
> > > > > set f_min to the downstream driver's default.
> > > > >
> > > > > Fixes: f629ba2c04c9 ("mmc: dw_mmc: add support for RK3288")
> > > > >
> > > > > Signed-off-by: Peter Geis <pgwipeout@xxxxxxxxx>
> > > > > ---
> > > > > drivers/mmc/host/dw_mmc-rockchip.c | 31 ++++++++++++++++++++++++++----
> > > > > 1 file changed, 27 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/mmc/host/dw_mmc-rockchip.c b/drivers/mmc/host/dw_mmc-rockchip.c
> > > > > index 95d0ec0f5f3a..c198590cd74a 100644
> > > > > --- a/drivers/mmc/host/dw_mmc-rockchip.c
> > > > > +++ b/drivers/mmc/host/dw_mmc-rockchip.c
> > > > > @@ -15,7 +15,9 @@
> > > > > #include "dw_mmc.h"
> > > > > #include "dw_mmc-pltfm.h"
> > > > >
> > > > > -#define RK3288_CLKGEN_DIV 2
> > > > > +#define RK3288_CLKGEN_DIV 2
> > > > > +#define RK3288_MIN_INIT_FREQ 375000
> > > > > +#define MMC_MAX_INIT_FREQ 400000
> > > > >
> > > > > >  struct dw_mci_rockchip_priv_data {
> > > > > >         struct clk *drv_clk;
> > > > > > @@ -27,6 +29,7 @@ struct dw_mci_rockchip_priv_data {
> > > > > >  static void dw_mci_rk3288_set_ios(struct dw_mci *host, struct mmc_ios *ios)
> > > > > >  {
> > > > > >         struct dw_mci_rockchip_priv_data *priv = host->priv;
> > > > > > +       struct mmc_host *mmc = mmc_from_priv(host);
> > > > > >         int ret;
> > > > > >         unsigned int cclkin;
> > > > > >         u32 bus_hz;
> > > > > > @@ -34,6 +37,10 @@ static void dw_mci_rk3288_set_ios(struct dw_mci *host, struct mmc_ios *ios)
> > > > > >         if (ios->clock == 0)
> > > > > >                 return;
> > > > > >
> > > > > > +       /* the clock will fail if below the f_min rate */
> > > > > > +       if (ios->clock < mmc->f_min)
> > > > > > +               ios->clock = mmc->f_min;
> > > > > +
> > > >
> > > > You shouldn't need this. The mmc core should manage this already.
> > >
> > > I thought so too, but while setting f_min did reduce the number of
> > > errors, it didn't stop them completely.
> > > Each tick I was getting three failures; it turns out the mmc core tries
> > > anyway with 300000, 200000, and 100000.
> > > Clamping it here was necessary to stop these.
> >
> > Ohh, that was certainly a surprise to me. Unless the dw_mmc driver
> > invokes this path on its own in some odd way, that means the mmc core
> > has a bug that we need to fix.
> >
> > Would you mind taking a stack trace or debugging this so we understand in
> > what case the mmc core doesn't respect f_min? It really should.
>
> I thought it was odd too; I'll look into where it's happening.
> Thanks!

[ 11.376608] Hardware name: Pine64 RK3566 Quartz64-A Board (DT)
[ 11.377127] Workqueue: events_freezable mmc_rescan
[ 11.377567] Call trace:
[ 11.377788] dump_backtrace.part.0+0xd8/0xe4
[ 11.378177] show_stack+0x24/0x80
[ 11.378479] dump_stack_lvl+0x68/0x84
[ 11.378812] dump_stack+0x1c/0x38
[ 11.379111] dw_mci_rk3288_set_ios+0x128/0x150
[ 11.379512] dw_mci_set_ios+0xb0/0x280
[ 11.379849] mmc_power_up.part.0+0xd0/0x17c
[ 11.380225] mmc_rescan+0x184/0x2f0
[ 11.380538] process_one_work+0x1e0/0x48c
[ 11.380901] worker_thread+0x148/0x46c
[ 11.381238] kthread+0x100/0x110
[ 11.381530] ret_from_fork+0x10/0x20

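Per the trace, the request reaches the driver via mmc_rescan() ->
mmc_power_up(). It seems to originate here:
https://elixir.bootlin.com/linux/latest/source/drivers/mmc/core/core.c#L2233
That path should already be guarded against rates below f_min, though.
I'm continuing to dig into it.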
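
For reference, here is a minimal userspace sketch of the behaviour I'd
expect from a retry loop that respects f_min. It is a model, not the code
in core.c: the function names are hypothetical, the rate list is simply the
400000 init ceiling plus the 300000/200000/100000 rates seen above, and
every attempt is assumed to fail so the loop runs as far as it can.

```c
#include <stddef.h>

/* Init rates tried from highest to lowest (assumed list, see above). */
static const unsigned int init_freqs[] = { 400000, 300000, 200000, 100000 };
#define N_INIT_FREQS (sizeof(init_freqs) / sizeof(init_freqs[0]))

/* Same clamp as the check added to dw_mci_rk3288_set_ios() in the patch. */
static unsigned int clamp_to_f_min(unsigned int clock, unsigned int f_min)
{
	return clock < f_min ? f_min : clock;
}

/*
 * Hypothetical model of a guarded retry loop: record every rate that
 * would be requested, assuming each attempt fails. Returns the number
 * of rates written to out[].
 */
static size_t guarded_attempts(unsigned int f_min, unsigned int *out,
			       size_t out_len)
{
	size_t n = 0;

	for (size_t i = 0; i < N_INIT_FREQS && n < out_len; i++) {
		/* Never ask for less than the host can provide. */
		out[n++] = clamp_to_f_min(init_freqs[i], f_min);

		/* Lower table entries would clamp to the same rate: stop. */
		if (init_freqs[i] <= f_min)
			break;
	}

	return n;
}
```

With f_min = 375000 (RK3288_MIN_INIT_FREQ in the patch), this model only
ever requests 400000 and 375000. Seeing 300000, 200000, and 100000 reach
set_ios would therefore mean some path is bypassing a guard like this.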

>
> >
> > [...]
> >
> > Kind regards
> > Uffe