Re: [RFC PATCH v3 07/12] genirq: Shutdown irq chips in suspend/resume during hibernation

From: Thomas Gleixner
Date: Fri Mar 06 2020 - 18:04:32 EST


Anchal Agarwal <anchalag@xxxxxxxxxx> writes:

> There are no pm handlers for the legacy devices, so during tear down
> stale event channel <> IRQ mapping may still remain in the image and
> resume may fail. To avoid adding much code by implementing handlers for
> legacy devices, add a new irq_chip flag IRQCHIP_SHUTDOWN_ON_SUSPEND which
> when enabled on an irq-chip e.g xen-pirq, it will let core suspend/resume
> irq code to shutdown and restart the active irqs. PM suspend/hibernation
> code will rely on this.
> Without this, in PM hibernation, information about the event channel
> remains in hibernation image, but there is no guarantee that the same
> event channel numbers are assigned to the devices when restoring the
> system. This may cause conflict like the following and prevent some
> devices from being restored correctly.

The above is just an agglomeration of words and acronyms and some of
these sentences do not even make sense. Anyone who is not aware of event
channels and whatever XENisms you talk about will be entirely
confused. Changelogs really need to be understandable for mere mortals
and there is no space restriction so acronyms can be written out.

Something like this:

Many legacy device drivers do not implement power management (PM)
functions which means that interrupts requested by these drivers stay
in active state when the kernel is hibernated.

This does not matter on bare metal and on most hypervisors because the
interrupt is restored on resume without any noticable side effects as
it stays connected to the same physical or virtual interrupt line.

The XEN interrupt mechanism is different as it maintains a mapping
between the Linux interrupt number and a XEN event channel. If the
interrupt stays active on hibernation this mapping is preserved but
there is unfortunately no guarantee that on resume the same event
channels are reassigned to these devices. This can result in event
channel conflicts which prevent the affected devices from being
restored correctly.

One way to solve this would be to add the necessary power management
functions to all affected legacy device drivers, but that's a
questionable effort which does not provide any benefits on non-XEN
environments.

The least intrusive and most efficient solution is to provide a
mechanism which allows the core interrupt code to tear down these
interrupts on hibernation and bring them back up again on resume. This
allows the XEN event channel mechanism to assign an arbitrary event
channel on resume without affecting the functionality of these
devices.

Fortunately all these device interrupts are handled by a dedicated XEN
interrupt chip so the chip can be marked that all interrupts connected
to it are handled this way. This is pretty much in line with the other
interrupt chip specific quirks, e.g. IRQCHIP_MASK_ON_SUSPEND.

Add a new quirk flag IRQCHIP_SHUTDOWN_ON_SUSPEND and add support for
it the core interrupt suspend/resume paths.

Hmm?

> Signed-off-by: Anchal Agarwal <anchalag@xxxxxxxxxx>
> Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

Not that I care much, but now that I've written both the patch and the
changelog you might change that attribution slightly. For completeness
sake:

Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

Thanks,

tglx