Re: [PATCH] mmc: Some Micron eMMC devices cause reboot to hang

From: Ulf Hansson
Date: Wed Aug 05 2020 - 04:28:13 EST


On Mon, 27 Jul 2020 at 15:07, Alan Cooper <alcooperx@xxxxxxxxx> wrote:
>
> On Fri, Jul 24, 2020 at 7:03 AM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> >
> > On Tue, 21 Jul 2020 at 21:18, Al Cooper <alcooperx@xxxxxxxxx> wrote:
> > >
> > > When using eMMC as the boot device, some Micron eMMC devices will
> > > cause reboot to hang. This is a result of the eMMC device not going
> > > into boot mode after the hardware sends CMD0 to reset the eMMC
> > > device. This only happens if the kernel driver sends CMD5 (SLEEP_WAKE),
> > > to put the device into sleep state, before restarting the system.
> >
> > What do you mean by "boot mode"?
>
> I'm referring to the "Boot operation mode" described in Section 6.3 of
> the JEDEC spec.
> Our hardware will send a CMD0 with 0xf0f0f0f0 argument at powerup or
> when the SoC is reset, and then hold the CLK signal low for 74 clock
> cycles. This should put the eMMC device into boot mode where it
> streams consecutive blocks without additional commands. With this
> Micron device I find that if I send a CMD5 before trying to restart
> the system by resetting the SoC, that the system hangs. I worked with
> Micron on the issue and they finally said to either avoid sending the
> CMD5 on restart or use a newer version of the Micron eMMC device.

Thanks for clarifying the test sequence!

However, I am still not (yet) convinced that a card quirk is the right
thing to do. What does the eMMC spec say about sending a CMD0 with
0xf0f0f0f0 to a device that "sleeps"?

Moreover, how does your mmc host driver (and platform) treat VCC and
VCCQ at system suspend/resume, compared to when a reset is done? Is
there a difference?

The point is, if the eMMC spec is being violated, we should not make a
card quirk - as it may cause problems for other platforms.

>
>
> >
> > When the kernel sends the CMD0 to wake up the eMMC from sleep, at
> > system resume for example, it all works fine, I guess. What is the
> > difference?
>
> On system resume the hardware will not try to put the eMMC device back
> into boot mode.

I see.

Does your host driver support HW busy signalling, so DAT0 is monitored
for de-assertion to confirm the CMD5 is completed by the kernel - or
do you rely on the per card sleep timeout to be used in mmc_sleep()?

Additionally, I wonder about what options you have to reset the eMMC?
Can we use something along the lines of
drivers/mmc/core/pwrseq_emmc.c? If it's not possible to do a HW reset,
we could try sending CMD0 with argument being '0' in the reset path.

What do you think?

>
> Al
>
> >
> > > The fix is to add a quirk that avoids sending the SLEEP command
> > > and to use MMC_FIXUP to set the quirk for these Micron devices.
> >
> > I am not sure this is Micron device specific, but rather some it's a
> > driver/platform bug. Maybe on the kernel side or in the bootloader
> > code.
> >

Kind regards
Uffe

> >
> > >
> > > Signed-off-by: Al Cooper <alcooperx@xxxxxxxxx>
> > > ---
> > > drivers/mmc/core/mmc.c | 3 ++-
> > > drivers/mmc/core/quirks.h | 8 ++++++++
> > > include/linux/mmc/card.h | 1 +
> > > 3 files changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> > > index 4203303f946a..4d69e8f8fe59 100644
> > > --- a/drivers/mmc/core/mmc.c
> > > +++ b/drivers/mmc/core/mmc.c
> > > @@ -1895,7 +1895,8 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
> > >
> > > static int mmc_can_sleep(struct mmc_card *card)
> > > {
> > > - return (card && card->ext_csd.rev >= 3);
> > > + return card && card->ext_csd.rev >= 3 &&
> > > + ((card->quirks & MMC_QUIRK_BROKEN_SLEEP) == 0);
> > > }
> > >
> > > static int mmc_sleep(struct mmc_host *host)
> > > diff --git a/drivers/mmc/core/quirks.h b/drivers/mmc/core/quirks.h
> > > index 472fa2fdcf13..7263187b6323 100644
> > > --- a/drivers/mmc/core/quirks.h
> > > +++ b/drivers/mmc/core/quirks.h
> > > @@ -99,6 +99,14 @@ static const struct mmc_fixup mmc_blk_fixups[] = {
> > > MMC_FIXUP("V10016", CID_MANFID_KINGSTON, CID_OEMID_ANY, add_quirk_mmc,
> > > MMC_QUIRK_TRIM_BROKEN),
> > >
> > > + /*
> > > + * Some Micron eMMC devices will not go into boot mode on
> > > + * CMD0 arg: 0XF0F0F0F0 after going into SLEEP state.
> > > + * This will hang a reboot.
> > > + */
> > > + MMC_FIXUP(CID_NAME_ANY, CID_MANFID_NUMONYX, 0x014e, add_quirk_mmc,
> > > + MMC_QUIRK_BROKEN_SLEEP),
> > > +
> > > END_FIXUP
> > > };
> > >
> > > diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
> > > index 7d46411ffaa2..0cdddcb5e17d 100644
> > > --- a/include/linux/mmc/card.h
> > > +++ b/include/linux/mmc/card.h
> > > @@ -270,6 +270,7 @@ struct mmc_card {
> > > #define MMC_QUIRK_BROKEN_IRQ_POLLING (1<<11) /* Polling SDIO_CCCR_INTx could create a fake interrupt */
> > > #define MMC_QUIRK_TRIM_BROKEN (1<<12) /* Skip trim */
> > > #define MMC_QUIRK_BROKEN_HPI (1<<13) /* Disable broken HPI support */
> > > +#define MMC_QUIRK_BROKEN_SLEEP (1<<14) /* Broken sleep mode */
> > >
> > > bool reenable_cmdq; /* Re-enable Command Queue */
> > >
> > > --
> > > 2.17.1
> > >