Re: [PREEMPT_RT] 8250 IRQ lockup when flooding serial console (was Re: [ANNOUNCE] v5.4.28-rt19)

From: Sebastian Andrzej Siewior
Date: Thu Apr 23 2020 - 12:07:12 EST


On 2020-04-23 12:45:59 [+0200], To Jiri Kosina wrote:
> On 2020-04-23 11:12:59 [+0200], Jiri Kosina wrote:
> > On Thu, 23 Apr 2020, Jiri Kosina wrote:
> >
> > > > I'm pleased to announce the v5.4.28-rt19 patch set.
> > >
> > > First, I don't believe this is necessarily a regression coming with this
> > > particular version, but this is the first kernel where I tried this and it
> > > crashed.
> >
> > I just tried with 5.6.4-rt3, and I can make it explode exactly the same
> > way:
>
> I thought I dealt with it. In the past it triggered also with threadirqs
> on !RT but this isn't the case anymore. It still explodes on RT. Let me
> look…

So it also happens with !RT, you just have to try a little harder. For
instance, make the PASS_LIMIT change in
drivers/tty/serial/8250/8250_core.c apply to !RT as well, and boom.

IRQ4 is edge-triggered and serves ttyS0. It is handled by
handle_edge_irq(): after ->irq_ack() the thread is woken up, and then
we get another handle_edge_irq() for IRQ4. With a larger PASS_LIMIT the
thread runs longer, so note_interrupt() sees fewer IRQ_HANDLED results
based on ->threads_handled_last. If it observes 100 handled within
100000 interrupts, the counters are reset again. On !RT it usually
manages to get >100 handled per 100000 interrupts, so it appears good.
On RT it gets fewer and the interrupt gets disabled.
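
To illustrate the accounting described above, here is a rough userspace
model. It is a simplified sketch, not the code from the kernel: struct
irq_model and both functions are made-up names, the thresholds are just
the 100-per-100000 figures mentioned above, and the thread/
->threads_handled_last handling is reduced to a single "handled" flag
per interrupt.

```c
#include <stdbool.h>

/* Assumed thresholds, taken from the description above. */
#define WINDOW       100000u  /* interrupts per evaluation window */
#define MIN_HANDLED  100u     /* handled interrupts needed to stay enabled */

/* Hypothetical, stripped-down stand-in for the per-IRQ counters. */
struct irq_model {
	unsigned int irq_count;      /* interrupts seen in this window */
	unsigned int irqs_unhandled; /* of those, not credited as handled */
	bool disabled;               /* set once the IRQ is considered stuck */
};

/* Account one interrupt; 'handled' says whether it was credited. */
static void model_note_interrupt(struct irq_model *m, bool handled)
{
	if (m->disabled)
		return;

	if (!handled)
		m->irqs_unhandled++;

	if (++m->irq_count < WINDOW)
		return;

	/* Window is full: too few handled interrupts disables the line. */
	if (m->irqs_unhandled > WINDOW - MIN_HANDLED)
		m->disabled = true;

	/* Otherwise (and in this model, in any case) start a new window. */
	m->irq_count = 0;
	m->irqs_unhandled = 0;
}

/*
 * Feed one full window in which only every 'period'-th interrupt is
 * credited as handled (period == 0 means none is). Returns true if the
 * interrupt is still enabled afterwards.
 */
static bool model_survives_window(unsigned int period)
{
	struct irq_model m = { 0 };
	unsigned int i;

	for (i = 1; i <= WINDOW; i++)
		model_note_interrupt(&m, period != 0 && i % period == 0);

	return !m.disabled;
}
```

In this model, model_survives_window(500) credits 200 interrupts per
window and the line stays enabled, while model_survives_window(2000)
credits only 50 and the line gets disabled. That is the difference
between the !RT and the RT case described above: the longer the thread
runs per wakeup, the fewer interrupts are credited as handled.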

So it is not RT-specific, but RT triggers it more reliably (also, the
PASS_LIMIT change can be dropped).
I can't tell if this is a qemu bug in emulating the HW or not. I can't
reproduce it on real HW. There I see a second edge interrupt only after
the thread has completed. I can't tell if this is because it is a real
UART and the data is flowing slower or because the edge-IRQ is not
triggered repeatedly.

Sebastian