Re: 5.14-rc failure to resume

From: Andy Shevchenko
Date: Sat Jul 24 2021 - 15:50:02 EST


On Sat, Jul 24, 2021 at 8:56 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
> On 7/24/21 9:57 AM, Jens Axboe wrote:

> > I ran into this when doing the last bit of testing on pending changes
> > for this release on the laptop. Outside of running testing on these
> > changes, I always build and boot current -git and my changes on my
> > laptop as well.
> >
> > 5.14-rc1 + changes works fine, current -git and changes fail to resume
> > every single time. I just get a black screen. Tip of tree before merging
> > fixes is:
> >
> > commit 704f4cba43d4ed31ef4beb422313f1263d87bc55 (origin/master, origin/HEAD, master)
> > Merge: 05daae0fb033 0077a5008272
> > Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > Date: Fri Jul 23 11:30:12 2021 -0700
> >
> > Merge tag 'ceph-for-5.14-rc3' of git://github.com/ceph/ceph-client
> >
> > Since bisection takes forever on the laptop (gen7 x1 carbon), I
> > opportunistically reverted some of the most recent git pulls:
> >
> > - ec6badfbe1cde0eb2bec4a0b8f6e738171156b5b (acpi changes)
> > - 1d597682d3e669ec7021aa33d088ed3d136a5149 (driver-core changes)
> > - 74738c556db6c7f780a8b98340937e55b72c896a (usb changes)
> > - e7562a00c1f54116f5a058e7e3ddd500188f60b2 (sound changes)
> > - 8baef6386baaefb776bdd09b5c7630cf057c51c6 (drm changes)
> >
> > as they could potentially be involved, but even with all of those
> > reverted it still won't resume.
> >
> > Sending this out in case someone has already reported this and I just
> > couldn't find it. If this is a new/unknown issue, I'll go ahead and
> > bisect it.
>
> Ran a bisect, and it pinpoints:

Thanks for the report!

> 71f6428332844f38c7cb10461d9f29e9c9b983a0 is the first bad commit
> commit 71f6428332844f38c7cb10461d9f29e9c9b983a0
> Author: Andy Shevchenko <andy.shevchenko@xxxxxxxxx>
> Date: Mon Jul 12 21:21:21 2021 +0300
>
> ACPI: utils: Fix reference counting in for_each_acpi_dev_match()
>
> which seems odd, as it worked for me with the acpi changes reverted. It
> could be that it _sometimes_ works with that commit, not sure. Adding
> relevant folks to the CC.
>
> I'm going to revert this on top of current master and run with that
> and see if it does 10 successful resumes.

This commit touches the API and two other parts: EFI for Apple devices
(which doesn't seem to be your case) and the CIO2 bridge (camera device
on Intel Skylake and Kaby Lake machines). The EFI code runs at boot
time AFAIU, and the CIO2 code runs at the device's ->probe() time. I'm
a bit puzzled as to why it affects resume... Daniel, any ideas?
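
For anyone unfamiliar with the pattern named in the commit title, here
is a minimal userspace sketch of a reference-counted "for each match"
iterator. It is not the kernel code: struct toy_dev, toy_get(),
toy_put(), toy_next_match() and toy_for_each_match() are hypothetical
names made up for illustration. The only assumption is that each step
takes a reference on the device it returns and drops the reference
held on the previous one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a reference-counted device (hypothetical, not struct acpi_device). */
struct toy_dev {
	const char *hid;
	int refcount;
};

/* Take a reference; NULL is passed through unchanged. */
static struct toy_dev *toy_get(struct toy_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

/* Drop a reference; NULL is ignored. */
static void toy_put(struct toy_dev *d)
{
	if (d) {
		assert(d->refcount > 0);
		d->refcount--;
	}
}

/*
 * Return the next device after @prev whose hid matches, with a reference
 * held on it, and drop the reference held on @prev. Pass prev == NULL to
 * start from the beginning of the array.
 */
static struct toy_dev *toy_next_match(struct toy_dev *devs, size_t n,
				      struct toy_dev *prev, const char *hid)
{
	size_t i = prev ? (size_t)(prev - devs) + 1 : 0;

	toy_put(prev);
	for (; i < n; i++)
		if (strcmp(devs[i].hid, hid) == 0)
			return toy_get(&devs[i]);
	return NULL;
}

/* Iterate over all matches; the loop variable always holds one reference. */
#define toy_for_each_match(d, devs, n, hid)				\
	for ((d) = toy_next_match((devs), (n), NULL, (hid));		\
	     (d);							\
	     (d) = toy_next_match((devs), (n), (d), (hid)))

/* Walk every match to the end; all references are dropped again on exit. */
static int toy_count_matches(struct toy_dev *devs, size_t n, const char *hid)
{
	struct toy_dev *d;
	int count = 0;

	toy_for_each_match(d, devs, n, hid)
		count++;
	return count;
}
```

In this toy model a full walk leaves every refcount where it started,
while a caller that breaks out of the loop early still holds one
reference and has to call toy_put() itself.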

--
With Best Regards,
Andy Shevchenko