Re: Lost MSIs during hibernate

From: Thomas Gleixner
Date: Tue Apr 05 2022 - 16:23:12 EST


Evan!

On Mon, Apr 04 2022 at 12:11, Evan Green wrote:
> To my surprise, I'm back with another MSI problem, and hoping to get
> some advice on how to approach fixing it.

Why am I not surprised?

> What worries me is those IRQ "no longer affine" messages, as well as
> my "EVAN don't touch hw" prints, indicating that requests to change
> the MSI are being dropped. These ignored requests are coming in when
> we try to migrate all IRQs off of the non-boot CPU, and they get
> ignored because all devices are "frozen" at this point, and presumably
> not in D0.

They are disabled at that point.

> To further try and prove that theory, I wrote a script to do the
> hibernate prepare image step in a loop, but messed with XHCI's IRQ
> affinity beforehand. If I move the IRQ to core 0, so far I have never
> seen a hang. But if I move it to another core, I can usually get a
> hang in the first attempt. I also very occasionally see wifi splats
> when trying this, and those "no longer affine" prints are all the wifi
> queue IRQs. So I think a wifi packet coming in at the wrong time can
> do the same thing.
>
> I wanted to see what thoughts you might have on this. Should I try to
> make a patch that moves all IRQs to CPU 0 *before* the devices all
> freeze? Sounds a little unpleasant. Or should PCI be doing something
> different to avoid this combination of "you're not allowed to modify
> my MSIs, but I might still generate interrupts that must not be lost"?

PCI cannot do much here and moving interrupts around is papering over
the underlying problem.

xhci_hcd 0000:00:0d.0: EVAN Write MSI 0 fee1e000 4023

This sets up the interrupt when the driver is loaded

xhci_hcd 0000:00:14.0: EVAN Write MSI 0 fee01000 4024

Ditto

xhci_hcd 0000:00:0d.0: calling pci_pm_freeze+0x0/0xad @ 423, parent: pci0000:00
xhci_hcd 0000:00:14.0: calling pci_pm_freeze+0x0/0xad @ 4644, parent: pci0000:00
xhci_hcd 0000:00:14.0: pci_pm_freeze+0x0/0xad returned 0 after 0 usecs
xhci_hcd 0000:00:0d.0: EVAN Write MSI 0 fee1e000 4023
xhci_hcd 0000:00:0d.0: pci_pm_freeze+0x0/0xad returned 0 after 196000 usecs

Those freeze() calls end up in xhci_suspend(), which tears down the XHCI
and ensures that no interrupts are on flight.

xhci_hcd 0000:00:0d.0: calling pci_pm_freeze_noirq+0x0/0xb2 @ 4645, parent: pci0000:00
xhci_hcd 0000:00:0d.0: pci_pm_freeze_noirq+0x0/0xb2 returned 0 after 30 usecs
xhci_hcd 0000:00:14.0: calling pci_pm_freeze_noirq+0x0/0xb2 @ 4644, parent: pci0000:00
xhci_hcd 0000:00:14.0: pci_pm_freeze_noirq+0x0/0xb2 returned 0 after 3118 usecs

Now the devices are disabled and not accessible

xhci_hcd 0000:00:14.0: EVAN Don't touch hw 0 fee00000 4024
xhci_hcd 0000:00:0d.0: EVAN Don't touch hw 0 fee1e000 4045
xhci_hcd 0000:00:0d.0: EVAN Don't touch hw 0 fee00000 4045
xhci_hcd 0000:00:14.0: calling pci_pm_thaw_noirq+0x0/0x70 @ 9, parent: pci0000:00
xhci_hcd 0000:00:14.0: EVAN Write MSI 0 fee00000 4024

This is the early restore _before_ the XHCI resume code is called
This interrupt is targeted at CPU0 (it's the one which could not be
written above).

xhci_hcd 0000:00:14.0: pci_pm_thaw_noirq+0x0/0x70 returned 0 after 5272 usecs
xhci_hcd 0000:00:0d.0: calling pci_pm_thaw_noirq+0x0/0x70 @ 1123, parent: pci0000:00
xhci_hcd 0000:00:0d.0: EVAN Write MSI 0 fee00000 4045

Ditto

xhci_hcd 0000:00:0d.0: pci_pm_thaw_noirq+0x0/0x70 returned 0 after 623 usecs
xhci_hcd 0000:00:14.0: calling pci_pm_thaw+0x0/0x7c @ 3856, parent: pci0000:00
xhci_hcd 0000:00:14.0: pci_pm_thaw+0x0/0x7c returned 0 after 0 usecs
xhci_hcd 0000:00:0d.0: calling pci_pm_thaw+0x0/0x7c @ 4664, parent: pci0000:00
xhci_hcd 0000:00:0d.0: pci_pm_thaw+0x0/0x7c returned 0 after 0 usecs

That means the suspend/resume logic is doing the right thing.

How the XHCI ends up being confused here is a mystery. Cc'ed a few more folks.

Thanks,

tglx