Re: Race between MMIO writes and level IRQs

From: Russell King - ARM Linux admin
Date: Thu Jun 06 2019 - 06:29:08 EST


On Thu, Jun 06, 2019 at 11:53:05AM +0200, Marc Gonzalez wrote:
> Hello everyone,
>
> There's something about interrupts I have never quite understood,
> which I'd like to clear up once and for all. What I'm about to write
> will probably sound trivial to anyone who's already figured it out,
> but I need to walk through it.
>
> Consider a device, living on some peripheral bus, with an interrupt
> line flowing from the device into some kind of interrupt controller.
>
> I.e. there are two "communication channels"
> 1) the peripheral bus, and 2) the "out-of-band" interrupt line.
>
> At some point, the device requires the CPU to do $SOMETHING. It sends
> a signal over the interrupt line (either a pulse for edge interrupts,
> or keeping the line high for level interrupts). After some time, the
> CPU will "take the interrupt", mask all(?) interrupts, and jump to the
> proper interrupt service routine (ISR).
>
> The CPU does whatever it's supposed to do, and then needs to inform
> the device that "yes, the work is done, stop pestering me". Typically,
> this is done by writing some value to one of the device's registers.
>
> AFAICT, this is the part where things can go wrong:
>
> The CPU issues the magic MMIO write, which will take some time to reach
> the device over the peripheral bus. Meanwhile, the device maintains the
> IRQ signal (assuming a level interrupt). Once the CPU leaves the ISR, the
> framework will unmask IRQs. If the write has not yet reached the device,
> the CPU will be needlessly interrupted again.
>
> Basically, there's a race between the MMIO write and the IRQ unmasking.
> We'd like to be able to guarantee that the MMIO write is complete before
> unmasking interrupts, right?
>
> Some people use memory barriers, but my understanding is that this is
> not sufficient. The memory barrier may guarantee that the MMIO write
> has left the CPU "domain", but not that it has reached the device.
>
> Am I mistaken?

You are not mistaken, and this issue has been known for a long time for
busses such as PCI, where writes are "posted" - they can be delayed by
any PCI bridge. The PCI ordering rules state that an MMIO write must
complete before an MMIO read is allowed, so drivers work around this
problem by using a write-readback sequence wherever it's important
that the write hits the device in a timely manner.
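
To make the write-readback idea concrete, here is a minimal sketch as a
toy user-space model, not driver code. Everything in it (struct model,
model_writel(), model_readl(), the queue depth, the write-1-to-clear
status register) is a hypothetical stand-in chosen for illustration. The
one property it assumes from the text above is that a read cannot
complete until earlier posted writes have reached the device.

```c
/* Toy user-space model of posted MMIO writes; all names are hypothetical. */
#include <assert.h>
#include <stdint.h>

#define QUEUE_DEPTH 4

struct model {
	uint32_t dev_irq_status;      /* device register: non-zero = IRQ line high */
	uint32_t posted[QUEUE_DEPTH]; /* writes still buffered in a bridge */
	int n_posted;
};

/* Posted write: returns as soon as the bridge has accepted it. */
static void model_writel(struct model *m, uint32_t clear_mask)
{
	assert(m->n_posted < QUEUE_DEPTH);
	m->posted[m->n_posted++] = clear_mask;
}

/* Deliver buffered writes to the device (write-1-to-clear semantics). */
static void model_flush(struct model *m)
{
	for (int i = 0; i < m->n_posted; i++)
		m->dev_irq_status &= ~m->posted[i];
	m->n_posted = 0;
}

/* Read: earlier posted writes must land before the read completes. */
static uint32_t model_readl(struct model *m)
{
	model_flush(m);
	return m->dev_irq_status;
}

/* What the interrupt controller sees right now. */
static int model_irq_line(const struct model *m)
{
	return m->dev_irq_status != 0;
}

/*
 * ISR acknowledging the interrupt, with or without the readback.
 * Returns the state of the IRQ line at the point where the framework
 * would unmask interrupts again.
 */
static int model_isr_ack(struct model *m, int readback)
{
	model_writel(m, 0x1);
	if (readback)
		(void)model_readl(m);
	return model_irq_line(m);
}
```

Starting from dev_irq_status = 0x1, model_isr_ack(&m, 0) leaves the line
asserted because the write is still sitting in the bridge, while
model_isr_ack(&m, 1) returns with the line deasserted. In a real driver
the same shape would be a writel() followed by a readl() of a register
on the same device.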

> So it looks like the only surefire way to guarantee that the MMIO write
> has reached the device is to read the value back from the device?
>
> Tangential: is this one of the issues solved by MSI?
> https://en.wikipedia.org/wiki/Message_Signaled_Interrupts#Advantages

If anything, MSI is even worse - for example, if you disable the
interrupt at a device by writing and then reading back, an interrupt
could be "in flight" via the MSI mechanism just before the write hits,
but the CPU receives the interrupt after the read-back.
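
The ordering described above can be sketched with another toy user-space
model. Again, every name here (struct msi_model, msi_dev_event(),
msi_cpu_readback() and so on) is hypothetical, and the model simply
assumes that delivery of an MSI to the CPU is not ordered against the
CPU observing its read-back result. It is an illustration of the race,
not a description of any particular platform.

```c
/* Toy model of an MSI "in flight" across a disable + read-back. */
#include <assert.h>
#include <stdbool.h>

struct msi_model {
	bool dev_irq_enabled; /* device-side interrupt enable bit */
	bool write_posted;    /* a "disable" write is buffered towards the device */
	int msi_in_flight;    /* MSIs sent but not yet seen by the CPU */
	int cpu_irqs_taken;   /* interrupts the CPU has actually received */
};

/* Device wants attention: it sends an MSI only while enabled. */
static void msi_dev_event(struct msi_model *m)
{
	if (m->dev_irq_enabled)
		m->msi_in_flight++;
}

/* CPU posts the write that disables the interrupt at the device. */
static void msi_cpu_write_disable(struct msi_model *m)
{
	m->write_posted = true;
}

/* CPU read-back: forces the posted write to land, returns the enable bit. */
static bool msi_cpu_readback(struct msi_model *m)
{
	if (m->write_posted) {
		m->dev_irq_enabled = false;
		m->write_posted = false;
	}
	return m->dev_irq_enabled;
}

/* Any MSI still in flight finally reaches the CPU. */
static void msi_deliver(struct msi_model *m)
{
	m->cpu_irqs_taken += m->msi_in_flight;
	m->msi_in_flight = 0;
}

/* Returns how many interrupts arrive after the read-back said "disabled". */
static int msi_race_demo(void)
{
	struct msi_model m = { .dev_irq_enabled = true };
	bool still_enabled;

	msi_dev_event(&m);         /* MSI leaves just before the write hits */
	msi_cpu_write_disable(&m);
	still_enabled = msi_cpu_readback(&m);
	assert(!still_enabled);    /* device really is disabled now */
	msi_dev_event(&m);         /* no new MSI is generated after that */
	msi_deliver(&m);           /* ...but the old one still arrives */
	return m.cpu_irqs_taken;
}
```

In this model msi_race_demo() reports one interrupt taken even though
the read-back confirmed the device was disabled, so a handler has to
cope with an interrupt arriving after the driver believes the source is
off.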

Note that the race condition the Wikipedia article talks about is
between the device DMAing _to_ memory and the device sending an MSI.
It is not about the CPU writing to the device and the device sending
an MSI.

However, there is another aspect to consider - in an SMP system, the
interrupt could be processed by another CPU, so by the time you write
a peripheral register to stop an interrupt, the interrupt handler may
already be executing on some other CPU. synchronize_irq() helps to
avoid that by waiting for any running handler to finish. Note that it
doesn't help with the MSI issue.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up