RE: [PATCH] genirq: for edge interrupt IRQS_ONESHOT support withirq thread

From: Thomas Gleixner
Date: Fri Oct 12 2012 - 09:47:26 EST


On Fri, 12 Oct 2012, Liu, Chuansheng wrote:
> > -----Original Message-----
> > From: Thomas Gleixner [mailto:tglx@xxxxxxxxxxxxx]
> > Sent: Friday, October 12, 2012 8:46 PM
> > To: Liu, Chuansheng
> > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > Subject: RE: [PATCH] genirq: for edge interrupt IRQS_ONESHOT support with irq
> > thread
> >
> > On Fri, 12 Oct 2012, Liu, Chuansheng wrote:
> > > > -----Original Message-----
> > > > From: Thomas Gleixner [mailto:tglx@xxxxxxxxxxxxx]
> > > > Sent: Friday, October 12, 2012 8:32 PM
> > > > To: Liu, Chuansheng
> > > > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > > > Subject: Re: [PATCH] genirq: for edge interrupt IRQS_ONESHOT support
> > with irq
> > > > thread
> > > >
> > > > On Fri, 12 Oct 2012, Chuansheng Liu wrote:
> > > > >
> > > > > In our system, there is one edge interrupt, and we want it to be
> > > > > irq thread with IRQS_ONESHOT, and found in handle_edge_irq(),
> > > > > even with IRQS_ONESHOT, the irq is still unmasked without care of
> > > > > flag IRQS_ONESHOT.
> > > > >
> > > > > It causes IRQS_ONESHOT can not work well for edge interrupt, but also
> > > > > after the irq thread finished with flag IRQS_ONESHOT, the irq will be
> > > > > possible to be unmasked again, it should be messing mask/unmask logic.
> > > >
> > > > This is just wrong. By masking edge interrupts you will run into
> > > > situations where you will lose interrupts.
> > > >
> > > > Can you please explain, why you want to mask your edge interrupt?
> >
> > > When I request_irq with irq thread handler and flag IRQS_ONESHOT, if
> > > do not mask the edge interrupt, the primary handler and irq thread
> > > maybe run at the same time, and in my real case it causes spin
> > > deadlock.
> >
> > Then your code is simply wrong and you need to fix it instead of hacking a
> > workaround into the core code. Locking is not that hard.
>
> I got means, so I want to use flag IRQS_ONESHOT to avoid the case that two
> handlers running at the same time. Is it right direction?

Again, this does not work for edge type interrupts. Period. And we are
not adding something which is known to be broken.

Also there is no problem when a hard interrupt comes in while the
thread handler is running. It's just a matter of proper code and
proper locking.

> But IRQS_ONESHOT does not work well for edge interrupt.
> And pasting the IRQS_ONESHOT description:
> * IRQS_ONESHOT - irq is not unmasked in primary handler

Right, and edge type interrupts doe not support it.

> And I need irq handler is because some heavy work is needed, it can avoid local
> irq disabling time thru irq handler.

Disable the interrupt at the device level in your primary handler, but
do not try to impose something to the core code which is fundamentaly
wrong.

> >
> > > You means it is not right with IRQS_ONESHOT for edge interrupt?
> >
> > It's wrong. Simnply because you can lose interrupts.
> >
> > interrupt raised
> > handle_edge_irq()
> > mask_ack_irq()
> > handle_event()
> > wake irq thread
> > reti
> >
> > irq thread runs
> > handle device interrupt()
> > <--- device issues edge irq
> > unmask_irq()
> >
> > This interrupt is not delivered. So your device stops working. Not
> > what you want, right?

> Device should not stop:) And even in current handle_edge_irq(), it
> is possible that losting Interrupt if primary handler need some time
> and the irq is quick enough. I says the below code, it just avoid

The code flow is:

Interrupt
ack()
handler()
RETI

After the ack another interrupt can come in. It's raised in the CPU,
but it cannot be delivered because the CPU is running that very
interrupt at this point with interupts disabled. After RETI this
interrupt is delivered and runs the edge handler again. That's on UP.

On SMP an interrupt which is raised after the ack() again before the
handler finishes, can invoke another delivery on a different CPU,
which then sees the IRQ_INPROGESS flag, masks it and flags it
PENDING. When the primary handler on the first CPU returns, it sees
the PENDING flag, unmasks and invokes the handler another time.

So nothing gets lost. Now you mask it and if you look at the flow I
showed in my last mail, then your device will be stuck. Simply because
the interrupt was delivered while the line was masked at the irq chip
level which causes a drop. That's a property of edge type interrupts
and we have proper code to deal with it. No way to change that just
that you can avoid to fix your broken driver design.

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/