Re: [tip: irq/core] x86: Select HARDIRQS_SW_RESEND on x86

From: Thomas Gleixner
Date: Thu Mar 12 2020 - 11:55:35 EST


Linus Walleij <linus.walleij@xxxxxxxxxx> writes:
> On Wed, Mar 11, 2020 at 10:42 PM tip-bot2 for Hans de Goede
> Just help me understand the semantics of this thing...
>
> According to the text in KConfig:
>
> # Tasklet based software resend for pending interrupts on enable_irq()
> config HARDIRQS_SW_RESEND
> bool
>
> According to
> commit a4633adcdbc15ac51afcd0e1395de58cee27cf92
>
> [PATCH] genirq: add genirq sw IRQ-retrigger
>
> Enable platforms that do not have a hardware-assisted
> hardirq-resend mechanism
> to resend them via a softirq-driven IRQ emulation mechanism.
>
> so when enable_irq() is called, if the IRQ is already asserted,
> it will be distributed in the form of a software irq?
>
> OK I give up I don't understand the semantics of this thing.

Level type interrupts are "resending" in hardware as long as the device
interrupt is still asserted.

The problem is edge interrupts.

When an edge interrupt is disabled via disable_irq() the core does
not mask the line at the chip. The reason is that not all interrupt
chips latch an edge which the device raises while the line is masked
and forward it to the CPU on unmask, i.e. some interrupt chips simply
ignore an edge when the line is masked.

So when the device raises an edge while the interrupt is disabled
the core still handles the hardware interrupt and:

- masks the interrupt line
- sets the pending bit
- does not invoke the device handler
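
To make the three steps above concrete, here is a minimal user-space
sketch. It is a toy model, not the kernel code: struct edge_model and
the edge_model_*() helpers are hypothetical names invented for this
illustration, and the model assumes a chip which only gets masked
lazily, once an edge actually arrives while the interrupt is disabled.

```c
#include <stdbool.h>

/* Hypothetical toy model; this is NOT the kernel's struct irq_desc. */
struct edge_model {
	bool disabled;     /* disable_irq() has been called */
	bool masked;       /* line is masked at the interrupt chip */
	bool pending;      /* an edge arrived while disabled */
	int  handler_runs; /* number of device handler invocations */
};

/* Model of disable_irq() for an edge interrupt: the chip is NOT masked. */
static void edge_model_disable(struct edge_model *irq)
{
	irq->disabled = true;
}

/* Model of the core handling an edge delivered by the hardware. */
static void edge_model_handle_edge(struct edge_model *irq)
{
	if (irq->disabled) {
		irq->masked = true;   /* mask the interrupt line */
		irq->pending = true;  /* set the pending bit */
		return;               /* do not invoke the device handler */
	}
	irq->handler_runs++;          /* normal case: run the device handler */
}
```

In this model an edge which arrives after edge_model_disable() leaves
handler_runs unchanged and sets both masked and pending, which is the
state enable_irq() later inspects.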

On enable_irq() the pending bit is checked and, if it is set, the core
tries to retrigger or resend the interrupt, but only if it's edge type.

So if the interrupt chip provides an irq_retrigger() callback, the
core uses that, and only if this fails or is not available does it
resort to software "resend", which means queueing the interrupt for
execution in tasklet context.
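
The decision order can be sketched as a small user-space model. This
is an illustration only, not the code in kernel/irq: all resend_*
names are hypothetical, the sw_resend field merely stands in for
HARDIRQS_SW_RESEND being selected, and the tasklet queueing is reduced
to a return value.

```c
#include <stdbool.h>
#include <stddef.h>

enum resend_trigger { RESEND_TRIGGER_EDGE, RESEND_TRIGGER_LEVEL };
enum resend_result  { RESEND_NONE, RESEND_HW, RESEND_SW };

/* Hypothetical stand-in for an interrupt chip with an optional callback. */
struct resend_chip {
	bool (*retrigger)(void); /* true if hardware re-raised the irq */
};

/* Hypothetical toy model of the per-interrupt state. */
struct resend_model {
	enum resend_trigger trigger;
	bool pending;                   /* set while the irq was disabled */
	bool sw_resend;                 /* models HARDIRQS_SW_RESEND=y */
	const struct resend_chip *chip; /* may be NULL in this model */
};

/* Example callbacks: a chip which can retrigger and one which fails. */
static bool resend_retrigger_ok(void)   { return true; }
static bool resend_retrigger_fail(void) { return false; }

/* Model of the check done on enable_irq(). */
static enum resend_result resend_model_check(struct resend_model *irq)
{
	/* Level interrupts are re-raised by hardware while asserted. */
	if (irq->trigger != RESEND_TRIGGER_EDGE)
		return RESEND_NONE;
	if (!irq->pending)
		return RESEND_NONE;
	irq->pending = false;

	/* Prefer the hardware retrigger if the chip provides one. */
	if (irq->chip && irq->chip->retrigger && irq->chip->retrigger())
		return RESEND_HW;

	/* Fall back to software resend (a tasklet in the real kernel). */
	return irq->sw_resend ? RESEND_SW : RESEND_NONE;
}
```

With a chip using resend_retrigger_ok() the model reports RESEND_HW.
With resend_retrigger_fail() or no callback at all it reports
RESEND_SW, but only when sw_resend is set.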

> I see that ARM and ARM64 simply just select this. What
> happens if you do that and why is x86 not selecting it in general?

IRQ resending on x86 is not really problem free for interrupts
which are directly connected to the local APIC. The only way which is
halfway safe is the hardware retrigger. See

https://lkml.kernel.org/r/20200306130623.590923677@xxxxxxxxxxxxx
https://lkml.kernel.org/r/20200306130623.684591280@xxxxxxxxxxxxx

for the gory details. The GPIO interrupts which hang off behind some
slow bus or are multiplexed in other ways are not affected by this
hardware design induced madness.

As I don't know how many other architectures have trainwrecked interrupt
delivery mechanisms (IA64 definitely does), I'm more than reluctant to
inflict this on the world unconditionally.

Hope that helps.

Thanks,

tglx