Re: [PATCH AUTOSEL 6.15 092/110] genirq: Retain disable depth for managed interrupts across CPU hotplug

From: Sasha Levin
Date: Wed Jun 18 2025 - 11:12:01 EST

On Fri, Jun 06, 2025 at 02:32:29PM +0200, Johan Hovold wrote:

On Sun, Jun 01, 2025 at 07:24:14PM -0400, Sasha Levin wrote:

From: Brian Norris <briannorris@xxxxxxxxxxxx>

[ Upstream commit 788019eb559fd0b365f501467ceafce540e377cc ]

Affinity-managed interrupts can be shut down and restarted during CPU
hotunplug/plug. Thereby the interrupt may be left in an unexpected state.
Specifically:

1. Interrupt is affine to CPU N
2. disable_irq() -> depth is 1
3. CPU N goes offline
4. irq_shutdown() -> depth is set to 1 (again)
5. CPU N goes online
6. irq_startup() -> depth is set to 0 (BUG! driver expects that the interrupt
is still disabled)
7. enable_irq() -> depth underflow / unbalanced enable_irq() warning
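
The sequence above can be replayed with a small userspace toy model. This is
only a sketch under stated assumptions, not kernel code: struct toy_irq and
all toy_*() helpers are hypothetical names invented for illustration, and the
model tracks nothing but the disable depth and whether an unbalanced enable
was seen.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; this is NOT the kernel's struct irq_desc. */
struct toy_irq {
	unsigned int depth;	/* 0 = enabled, N > 0 = disabled N times */
	bool unbalanced;	/* set when enable is called at depth 0 */
};

static void toy_disable_irq(struct toy_irq *irq)
{
	irq->depth++;
}

static void toy_enable_irq(struct toy_irq *irq)
{
	if (irq->depth == 0) {
		/* The real kernel warns about an unbalanced enable here. */
		irq->unbalanced = true;
		return;
	}
	irq->depth--;
}

/* Behaviour described in step 4: shutdown forces the depth to 1. */
static void toy_irq_shutdown_old(struct toy_irq *irq)
{
	irq->depth = 1;
}

/* Behaviour described in step 6: startup resets the depth to 0. */
static void toy_irq_startup(struct toy_irq *irq)
{
	irq->depth = 0;
}

/* Replays steps 2, 4, 6 and 7 and returns the final state. */
static struct toy_irq toy_replay_hotplug_old(void)
{
	struct toy_irq irq = { .depth = 0, .unbalanced = false };

	toy_disable_irq(&irq);		/* step 2: driver disables */
	assert(irq.depth == 1);
	toy_irq_shutdown_old(&irq);	/* step 4: CPU N went offline */
	assert(irq.depth == 1);
	toy_irq_startup(&irq);		/* step 6: CPU N came back online */
	toy_enable_irq(&irq);		/* step 7: driver re-enables */
	return irq;
}
```

In this model the driver's disable is lost at step 6, so the matching enable
at step 7 finds the depth already at 0 and is flagged as unbalanced.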

This is only a problem for managed interrupts and CPU hotplug; all other
cases, like request()/free()/request(), truly need to reset a possibly stale
disable depth value.

Provide a startup function, which takes the disable depth into account, and
invoke it for the managed interrupts in the CPU hotplug path.

This requires changing irq_shutdown() to increment the depth instead of
setting it to 1, which retains the disable depth but is harmless for the
other code paths using irq_startup(), which still reset the disable depth
unconditionally to keep the original correct behaviour.
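
As a rough illustration of the approach described in the two paragraphs
above, here is a userspace toy model of a depth-preserving shutdown and a
depth-aware startup. It is a sketch under stated assumptions, not the patch
itself: struct toy_irq, struct toy_result and every toy_*() helper are
hypothetical names, and only the disable depth is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; this is NOT the kernel's struct irq_desc. */
struct toy_irq {
	unsigned int depth;	/* 0 = enabled, N > 0 = disabled N times */
	bool unbalanced;	/* set when enable is called at depth 0 */
};

struct toy_result {
	unsigned int depth_after_online;
	unsigned int final_depth;
	bool unbalanced;
};

static void toy_disable_irq(struct toy_irq *irq)
{
	irq->depth++;
}

static void toy_enable_irq(struct toy_irq *irq)
{
	if (irq->depth == 0) {
		irq->unbalanced = true;	/* the real kernel would warn */
		return;
	}
	irq->depth--;
}

/* Shutdown adds one disable level instead of overwriting the depth. */
static void toy_irq_shutdown(struct toy_irq *irq)
{
	irq->depth++;
}

/* Unconditional startup, as used by the non-hotplug paths. */
static void toy_irq_startup(struct toy_irq *irq)
{
	irq->depth = 0;
}

/* Depth-aware startup: undo the shutdown's level, start only at 0. */
static void toy_irq_startup_managed(struct toy_irq *irq)
{
	assert(irq->depth > 0);
	irq->depth--;
	if (irq->depth == 0)
		toy_irq_startup(irq);
}

/* Replays an offline/online cycle, optionally with the IRQ disabled. */
static struct toy_result toy_replay_hotplug_fixed(bool driver_disabled)
{
	struct toy_irq irq = { .depth = 0, .unbalanced = false };
	struct toy_result res;

	if (driver_disabled)
		toy_disable_irq(&irq);
	toy_irq_shutdown(&irq);		/* CPU goes offline */
	toy_irq_startup_managed(&irq);	/* CPU comes back online */
	res.depth_after_online = irq.depth;
	if (driver_disabled)
		toy_enable_irq(&irq);
	res.final_depth = irq.depth;
	res.unbalanced = irq.unbalanced;
	return res;
}
```

With the driver's disable in place, the depth in this model goes 1, 2, 1
across the offline/online cycle, so the later enable brings it to 0 without
being flagged as unbalanced.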

KUnit tests will be added separately to cover some of these aspects.

[ tglx: Massaged changelog ]

Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Brian Norris <briannorris@xxxxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/all/20250514201353.3481400-2-briannorris@xxxxxxxxxxxx
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

This one breaks suspend on laptops like the Lenovo ThinkPad T14s. The issue
was just reported here by Alex:

https://lore.kernel.org/lkml/24ec4adc-7c80-49e9-93ee-19908a97ab84@xxxxxxxxx/

Please drop from all stable queues for now.

Will do, thanks!

--
Thanks,
Sasha
