Re: [RFC PATCH] irqchip/gic, gic-v3: Ensure data visibility in peripheral

From: Marc Zyngier
Date: Wed Sep 01 2021 - 03:04:27 EST


On Wed, 01 Sep 2021 07:31:15 +0100,
Leo Yan <leo.yan@xxxxxxxxxx> wrote:
>
> When an interrupt line is asserted, the GIC handles the interrupt with
> the following flow (with EOImode == 1):
>
>   gic_handle_irq()
>     `> do_read_iar()                          => Change int state to active
>     `> gic_write_eoir()                       => Drop int priority
>     `> handle_domain_irq()
>         `> generic_handle_irq_desc()
>             `> handle_fasteoi_irq()
>                 `> handle_irq_event()         => Peripheral handler and
>                                                  de-assert int line
>                 `> cond_unmask_eoi_irq()
>                     `> chip->irq_eoi()
>                         `> gic_eoimode1_eoi_irq() => Change int state to inactive
>
> In this flow there is no explicit memory barrier between
> handle_irq_event() and chip->irq_eoi(). It is possible that writes
> issued in handle_irq_event() have not yet reached the device when the
> callback chip->irq_eoi() is invoked. For a level-triggered interrupt
> this can lead to the following state transitions:
>
>   Flow                             |  Interrupt state in GIC
>   ---------------------------------+-------------------------------------
>   Interrupt line is asserted       |  'inactive' -> 'pending'
>   do_read_iar()                    |  'pending' -> 'pending & active'
>   handle_irq_event()               |  Write peripheral register but it's
>                                    |  not visible for device, so the
>                                    |  interrupt line is still asserted
>   chip->irq_eoi()                  |  'pending & active' -> 'pending'
>   ...
>   Produce spurious interrupt       |
>   with interrupt ID: 1024          |

1024? Surely not.

>                                    |  Finally the peripheral register is
>                                    |  updated and the interrupt line is
>                                    |  deasserted: 'pending' -> 'inactive'
>
> To avoid this potential issue, this patch adds a wmb() barrier before
> invoking the EOI operation. This makes sure the interrupt line is
> de-asserted in the peripheral before the interrupt is deactivated in
> the GIC, and so avoids the spurious interrupt.

If you want to ensure completion of device-specific writes, why isn't
it the job of the device driver to implement whatever semantics it
desires? What if the interrupt is (shock, horror!) driven by a system
register instead?

I think this is merely papering over a driver bug, and adds a
significant cost to all interrupts for no good reason.
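
To make the driver-side alternative concrete, here is a minimal
user-space model of the scenario in the quoted table. It is only a
sketch: the mock_* helpers, struct mock_dev and state_after_eoi() are
hypothetical names, and the assumption that a read from the device
forces the earlier posted write to complete is a property of the
model, not a guarantee for any particular bus. In a real driver this
would roughly correspond to reading a device register back after the
write that clears the interrupt, before returning from the handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified GIC-side state of one level-sensitive interrupt. */
enum gic_state { GIC_INACTIVE, GIC_PENDING, GIC_ACTIVE, GIC_ACTIVE_PENDING };

/* Hypothetical device whose "clear IRQ" write can sit in a posted-write
 * buffer before the device actually sees it. */
struct mock_dev {
	bool line_asserted;	/* level of the interrupt line into the GIC */
	bool clear_in_flight;	/* clear write issued but not yet visible */
};

/* Model assumption: for a level-sensitive interrupt the pending part of
 * the state simply follows the line level. */
static enum gic_state gic_state_of(bool active, bool line)
{
	if (active)
		return line ? GIC_ACTIVE_PENDING : GIC_ACTIVE;
	return line ? GIC_PENDING : GIC_INACTIVE;
}

/* Posted write: issued by the CPU, not yet observed by the device. */
static void mock_write_irq_clear(struct mock_dev *dev)
{
	dev->clear_in_flight = true;
}

/* Model assumption: a read from the same device cannot complete before
 * earlier writes to it, so it makes the clear take effect. */
static unsigned int mock_read_status(struct mock_dev *dev)
{
	if (dev->clear_in_flight) {
		dev->line_asserted = false;
		dev->clear_in_flight = false;
	}
	return dev->line_asserted;
}

/* Walk one interrupt through the quoted flow and return the GIC state
 * right after deactivation. */
enum gic_state state_after_eoi(bool read_back)
{
	struct mock_dev dev = { .line_asserted = true, .clear_in_flight = false };
	bool active;

	/* do_read_iar(): acknowledge while the line is still high. */
	active = true;
	assert(gic_state_of(active, dev.line_asserted) == GIC_ACTIVE_PENDING);

	/* handle_irq_event(): the driver clears the interrupt source and
	 * optionally reads back to push the write out to the device. */
	mock_write_irq_clear(&dev);
	if (read_back)
		(void)mock_read_status(&dev);

	/* chip->irq_eoi(): deactivate the interrupt. */
	active = false;
	return gic_state_of(active, dev.line_asserted);
}
```

In this model state_after_eoi(false) leaves the interrupt pending after
the EOI, which is the spurious re-entry described in the table, while
state_after_eoi(true) leaves it inactive. The cost of the read-back is
paid only by the driver that needs it, not by every interrupt.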

M.

--
Without deviation from the norm, progress is not possible.