Re: Problem with commit bf22ff45bed664aefb5c4e43029057a199b7070c

From: Thomas Gleixner
Date: Wed Jul 12 2017 - 16:21:48 EST


On Wed, 12 Jul 2017, Thomas Gleixner wrote:
> On Mon, 10 Jul 2017, Juergen Gross wrote:
> > It is based on suspend/resume framework. The main work to be done
> > additionally is to disconnect from the pv-backends at save time and
> > connect to the pv-backends again at restore time.
> >
> > The main function triggering all that is xen_suspend() (as seen in
> > above backtrace).
>
> The untested patch below should give you hooks to do what you need to do.
>
> Add the irq_suspend/resume callbacks and set the IRQCHIP_GENERIC_SUSPEND
> flag on your xen irqchip, so it actually gets invoked.
>
> I have to make that opt in right now because the callbacks are used in the
> generic irqchip implementation already. We can revisit that when you can
> confirm that this is actually solving the problem.

There might be an even simpler solution.

As this is using the regular suspend_device_irqs() call, you just might get
away with setting IRQCHIP_MASK_ON_SUSPEND for your irq chip. With that flag
set, the core does not rely on the lazy disable approach alone; it also
masks, at the chip level, all interrupts which are not marked as wakeup
irqs. I assume none of them are when you do that save/restore dance.
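
To make the difference concrete, here is a simplified user-space model of
the per-interrupt suspend step. It is a sketch, not kernel code: struct
model_chip, struct model_irq and the model_*() functions are hypothetical
stand-ins, and only the flag name IRQCHIP_MASK_ON_SUSPEND comes from the
kernel. It assumes that a wakeup irq is left alone and that everything else
is lazily disabled, with the chip-level mask applied only when the flag is
set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag name mirrors the kernel's; the value is illustrative only. */
#define IRQCHIP_MASK_ON_SUSPEND (1u << 2)

/* Hypothetical, stripped-down stand-in for struct irq_chip. */
struct model_chip {
	unsigned int flags;
};

/* Hypothetical, stripped-down stand-in for struct irq_desc. */
struct model_irq {
	const struct model_chip *chip;
	bool wakeup;     /* marked as a wakeup source */
	bool disabled;   /* software disabled state (lazy disable) */
	bool hw_masked;  /* masked at the chip level */
};

/* Model of what happens to one interrupt at suspend time. */
static void model_suspend_irq(struct model_irq *irq)
{
	if (irq->wakeup)
		return;			/* wakeup irqs are left alone */

	irq->disabled = true;		/* lazy disable: hardware untouched */

	if (irq->chip->flags & IRQCHIP_MASK_ON_SUSPEND)
		irq->hw_masked = true;	/* flag set: mask at the chip too */
}

/* Model of the walk over all interrupts. */
static void model_suspend_device_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		model_suspend_irq(&irqs[i]);
}
```

In the real driver the change would presumably amount to OR-ing
IRQCHIP_MASK_ON_SUSPEND into the .flags member of the relevant struct
irq_chip; which chips that covers on the xen side is not something this
sketch tries to answer.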

That said, you still might make the whole mechanism cleaner by using the
irq chip callbacks so you can avoid traversing all the interrupts another
time. But I can't say for sure as I got lost in that xen event channel code.

Thanks,

tglx