Re: [PATCH] irqchip: brcmstb-l2: use _irqsave variants in non-interrupt code

From: Marc Zyngier
Date: Thu Feb 21 2019 - 05:15:48 EST


On Wed, 20 Feb 2019 14:15:28 -0800
Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:

> From: Doug Berger <opendmb@xxxxxxxxx>
>
> Using the irq_gc_lock/irq_gc_unlock functions in the suspend and
> resume functions creates the opportunity for a deadlock during
> suspend, resume, and shutdown. Using the irq_gc_lock_irqsave/
> irq_gc_unlock_irqrestore variants prevents this possible deadlock.
>
> Signed-off-by: Doug Berger <opendmb@xxxxxxxxx>
> Signed-off-by: Florian Fainelli <f.fainelli@xxxxxxxxx>

Applied to irqchip-next with:

Cc: stable@xxxxxxxxxxxxxxx
Fixes: 7f646e92766e2 ("irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller")

Now, I'm worried this is not the only issue, see below.

> ---
> drivers/irqchip/irq-brcmstb-l2.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/irqchip/irq-brcmstb-l2.c b/drivers/irqchip/irq-brcmstb-l2.c
> index 0e65f609352e..83364fedbf0a 100644
> --- a/drivers/irqchip/irq-brcmstb-l2.c
> +++ b/drivers/irqchip/irq-brcmstb-l2.c
> @@ -129,8 +129,9 @@ static void brcmstb_l2_intc_suspend(struct irq_data *d)
> struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> struct irq_chip_type *ct = irq_data_get_chip_type(d);
> struct brcmstb_l2_intc_data *b = gc->private;
> + unsigned long flags;
>
> - irq_gc_lock(gc);
> + irq_gc_lock_irqsave(gc, flags);
> /* Save the current mask */
> b->saved_mask = irq_reg_readl(gc, ct->regs.mask);
>
> @@ -139,7 +140,7 @@ static void brcmstb_l2_intc_suspend(struct irq_data *d)
> irq_reg_writel(gc, ~gc->wake_active, ct->regs.disable);
> irq_reg_writel(gc, gc->wake_active, ct->regs.enable);
> }
> - irq_gc_unlock(gc);
> + irq_gc_unlock_irqrestore(gc, flags);
> }
>
> static void brcmstb_l2_intc_resume(struct irq_data *d)
> @@ -147,8 +148,9 @@ static void brcmstb_l2_intc_resume(struct irq_data *d)
> struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> struct irq_chip_type *ct = irq_data_get_chip_type(d);
> struct brcmstb_l2_intc_data *b = gc->private;
> + unsigned long flags;
>
> - irq_gc_lock(gc);
> + irq_gc_lock_irqsave(gc, flags);
> if (ct->chip.irq_ack) {
> /* Clear unmasked non-wakeup interrupts */
> irq_reg_writel(gc, ~b->saved_mask & ~gc->wake_active,
> @@ -158,7 +160,7 @@ static void brcmstb_l2_intc_resume(struct irq_data *d)
> /* Restore the saved mask */
> irq_reg_writel(gc, b->saved_mask, ct->regs.disable);
> irq_reg_writel(gc, ~b->saved_mask, ct->regs.enable);
> - irq_gc_unlock(gc);
> + irq_gc_unlock_irqrestore(gc, flags);
> }
>
> static int __init brcmstb_l2_intc_of_init(struct device_node *np,


I've had a quick look at the generic irqchip code, and the mask/unmask
code seems to suffer from something similar. Both implementations use
the non irq-safe version, and seem vulnerable to the following scenario:

From a non-interrupt context:

  irq_set_status_flags(irq, IRQ_DISABLE_UNLAZY)
  disable_irq(irq)
    irq_disable(irqdesc, true)
      irq_gc_mask_disable_reg(&irqdesc->irq_data)
        irq_gc_lock()

interrupt fires here:

  brcmstb_l2_intc_irq_handle()
    generic_handle_irq()
      handle_edge_irq()
        mask_ack_irq()
          brcmstb_l2_mask_and_ack()
            irq_gc_lock() ----> deadlock

I haven't actually observed this, but it feels like it could happen.
This should just be a matter of turning the mask/unmask/set_wake
callbacks into the irq-safe version (see patch below).
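
To make the scenario concrete, here is a rough userspace toy model of it.
It is not kernel code and not the patch itself: struct toy_cpu and all
toy_* functions are made up for illustration, and they only mimic one CPU,
one non-recursive lock and one interrupt line. The plain variant stands in
for irq_gc_lock()/irq_gc_unlock(), the other for the _irqsave/_irqrestore
pair.

```c
#include <stdbool.h>

/* Hypothetical toy model: one CPU, one non-recursive lock, one IRQ line. */
struct toy_cpu {
	bool irqs_enabled;	/* models the CPU's local interrupt-enable flag */
	bool lock_held;		/* models gc->lock */
	bool irq_pending;	/* an interrupt is waiting to be delivered */
	bool deadlocked;	/* set when the handler would spin forever */
};

/* Handler side: stands in for the mask_and_ack path taking the lock. */
static void toy_irq_handler(struct toy_cpu *cpu)
{
	if (cpu->lock_held) {
		/* A real spinlock would spin here forever on this CPU. */
		cpu->deadlocked = true;
		return;
	}
	cpu->lock_held = true;	/* take the lock, touch the registers... */
	cpu->lock_held = false;	/* ...and release it again */
}

/* Deliver a pending interrupt only if interrupts are enabled. */
static void toy_deliver_irq(struct toy_cpu *cpu)
{
	bool saved = cpu->irqs_enabled;

	if (!cpu->irq_pending || !cpu->irqs_enabled)
		return;

	cpu->irq_pending = false;
	cpu->irqs_enabled = false;	/* the handler runs with IRQs off */
	toy_irq_handler(cpu);
	cpu->irqs_enabled = saved;
}

/* Models the plain lock/unlock pair: interrupts stay enabled. */
static bool toy_mask_plain(struct toy_cpu *cpu)
{
	cpu->lock_held = true;
	cpu->irq_pending = true;	/* interrupt fires inside the section */
	toy_deliver_irq(cpu);		/* handler finds the lock held */
	cpu->lock_held = false;
	toy_deliver_irq(cpu);
	return cpu->deadlocked;
}

/* Models the irqsave/irqrestore pair: interrupts are off while locked. */
static bool toy_mask_irqsave(struct toy_cpu *cpu)
{
	bool flags = cpu->irqs_enabled;

	cpu->irqs_enabled = false;
	cpu->lock_held = true;
	cpu->irq_pending = true;	/* interrupt is raised but stays pending */
	toy_deliver_irq(cpu);		/* no-op: interrupts are disabled */
	cpu->lock_held = false;
	cpu->irqs_enabled = flags;
	toy_deliver_irq(cpu);		/* delivered only after the unlock */
	return cpu->deadlocked;
}
```

Starting from a toy_cpu with irqs_enabled set, toy_mask_plain() reports
the handler stuck on the lock, while toy_mask_irqsave() lets the handler
run only once the lock has been dropped. The model says nothing about SMP
or about how likely the window is to be hit in practice.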

Thomas, what do you think?

Thanks,

M.