Re: [PATCH] mmc: dw_mmc: Disable SDIO interrupts while suspended to fix suspend/resume

From: Doug Anderson
Date: Fri Apr 26 2019 - 18:08:58 EST


Hi,

On Fri, Apr 26, 2019 at 10:19 AM Emil Renner Berthing
<emil.renner.berthing@xxxxxxxxx> wrote:
>
> Hi Doug,
>
> TLDR: I'm no longer convinced this patch breaks suspend/resume more
> than it already is. Sorry about the noise.
>
> On Thu, 25 Apr 2019 at 23:25, Doug Anderson <dianders@xxxxxxxxxxxx> wrote:
> >
> > Hi,
> >
> > On Wed, Apr 24, 2019 at 1:19 AM Emil Renner Berthing
> > <emil.renner.berthing@xxxxxxxxx> wrote:
> > >
> > > Hi Douglas,
> > >
> > > Unfortunately this seems to break resume on my rk3399-gru-kevin. I have
> > > a semi-complicated setup with my rootfs as a btrfs on dmcrypt on
> > > mmcblk0 which is the dw_mmc, so I'm guessing something goes wrong when
> > > waking up the dw_mmc which probably wasn't suspended before this
> > > patch. It's not 100% consistent though. Sometimes I see it resume the
> > > first time I try suspending, but then 2nd time I suspend it won't come
> > > back.
> >
> > Thanks for testing!
>
> Thanks for your detailed response. It made me want to make absolutely
> sure that this patch is the culprit.
> As a baseline I booted a vanilla 5.0.9 and suspend/resumed it about a
> dozen times without any errors.
> So I applied this patch and immediately it crashed on suspend, but in
> a way that I could still see the kernel log,
> and it was the mwifiex driver that crashed. I rebooted and tried
> suspend/resume again and
> this time it seemed like it was the dwc3 or usb3-phy that crashed.
> I still have the kernel log if anyone is interested.
> However 3rd time booting 5.0.9 with this patch suspend/resume just works.
> At least the 2 dozen times I tried before giving up on making it crash.
> I went back to vanilla 5.0.9 and after a few tries I managed to make
> that one crash too.
> I guess that means this patch is off the hook. I'm sorry about the
> false report :/

No worries, I've certainly been there and I'm super happy to have
people testing patches. :-)

Odd that you're having suspend/resume problems. My first guess for
super randomness would be WiFi. The PCIe bus on rk3399 causes the
most impossible-to-debug problems if you try to access it at the wrong
time. If you disable WiFi do all your problems go away? I tried
putting v5.0.9 on the kevin sitting on my desk and it seems to
suspend/resume OK (25 cycles), but:

* I just jammed it straight onto a normal Chrome OS root filesystem.
Since that filesystem expects the GPU to be there, I'm just booting to
a serial prompt and the screen just displays the boot splash.

* I didn't try to configure WiFi or anything.

* I'm using the Chrome OS "fallback config" for the kernel (the config
our build system picks if building an upstream kernel without the
normal split config). AKA:
<https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/master/eclass/cros-kernel/rockchip64_defconfig>.
I'm not 100% sure everything is enabled there...

* I'm booting w/ serial console enabled and doing my testing with "no
console suspend" which can certainly affect suspend/resume timing.


Best of luck tracking your problems down! I suppose if things used to
work maybe a bisect would be possible?

-Doug