Re: [PATCH] 8250: option 'force_polling' for buggy IRQs

From: Eric W. Biederman
Date: Fri Jul 29 2016 - 13:51:29 EST


Max Staudt <mstaudt@xxxxxxx> writes:

> On 07/29/2016 11:23 AM, One Thousand Gnomes wrote:
>>> Serial consoles are already polled for output. So nothing should
>>> care until userspace starts, and the full serial driver initializes.
>>
>> At which point it hangs
>
> Yep, because the IRQ is never firing. It isn't screaming at all. :)
>
>
>>> So I suspect either "irqfixup" or "irqpoll" would handle this for you.
>>> If not I am certain a small tweak to some of that code would work.
>>
>> irqfixup won't usually help but irqpoll with HZ=1000 ought to, although
>> it has its own set of problems because not all devices with non shared
>> IRQ lines take kindly to irqpoll.

It might make sense to filter non-shared, edge-triggered interrupts out
of irqpoll for that reason. Anything that supports a level-triggered
interrupt should be fine.
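
Roughly, the filter I have in mind looks like the sketch below. It is
plain userspace C, not kernel code: struct irq_line, enum trigger_type
and irqpoll_may_poll() are names made up for illustration, and the rule
for shared edge-triggered lines is my assumption (a handler on a shared
line already has to cope with being called when its device did not
interrupt).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of an IRQ line (not a kernel struct). */
enum trigger_type { TRIGGER_EDGE, TRIGGER_LEVEL };

struct irq_line {
    unsigned int irq;          /* line number, e.g. 4 for the first UART */
    enum trigger_type trigger; /* how the line signals an interrupt */
    bool shared;               /* true if several devices share the line */
};

/* Return true if a polling pass may call the handlers on this line. */
static bool irqpoll_may_poll(const struct irq_line *line)
{
    assert(line != NULL);

    /* Level-triggered lines are always eligible for polling. */
    if (line->trigger == TRIGGER_LEVEL)
        return true;

    /*
     * Edge-triggered: skip the non-shared case, which is the one that
     * may not take kindly to unexpected handler calls. Allowing the
     * shared case is an assumption of this sketch.
     */
    return line->shared;
}
```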

> Hmm, the kernel is compiled as tickless. I tried booting with
> "irqpoll nohz=off" but that didn't help.
>
>
> What I could try is to build an option like "irqfire=4,1000" which would
> simulate an IRQ on line 4 at 1000 HZ and call the handler every time.
> Whether the handling driver likes it is a different question though.
>
> It sounds like "irqpoll" would do something similar, but based on the
> kernel's global HZ setting, and calling all handlers unconditionally.
> "irqfire" would be more specific.
>
> What do you think?
> Would this be useful for other broken systems, too?

I think so. I think I would go simpler and just start a recurring
timer in the irqpoll case.
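
To make the idea concrete, here is a userspace sketch of a recurring
poll, not a patch. Everything in it is hypothetical: struct polled_irq,
poll_tick() and the fake UART are illustration only, and "now" is just
a tick counter supplied by the caller rather than a real kernel timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Handler returns true if the device actually had work to do. */
typedef bool (*irq_handler_fn)(void *dev);

/* Hypothetical bookkeeping for one line that is polled from a timer. */
struct polled_irq {
    irq_handler_fn handler;
    void *dev;
    unsigned long period_ticks; /* how often to call the handler */
    unsigned long next_due;     /* tick at which the next call is due */
    unsigned long handled;      /* calls where the handler found work */
};

/* Call once per tick; invokes the handler whenever the period expires. */
static void poll_tick(struct polled_irq *p, unsigned long now)
{
    assert(p != NULL && p->handler != NULL);

    if (now < p->next_due)
        return;

    /* Re-arm first, so the timer keeps recurring. */
    p->next_due = now + p->period_ticks;
    if (p->handler(p->dev))
        p->handled++;
}

/* Fake device standing in for a UART whose IRQ never fires. */
struct fake_uart {
    int pending; /* bytes waiting in the (imaginary) FIFO */
    int calls;   /* how many times the handler was invoked */
};

static bool fake_uart_handler(void *dev)
{
    struct fake_uart *u = dev;

    u->calls++;
    if (u->pending == 0)
        return false; /* nothing to do, like returning IRQ_NONE */
    u->pending--;
    return true;
}
```

The handler gets called whether or not the device has anything pending,
so it has to tolerate the "nothing to do" case.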

All that really matters is that it is generally reliable and not too
hard to get working. That is what worries me a little about your
irqfire example: someone has to figure out which irq is not firing,
which might be hard if you can't log in.

But shrug. You are writing the patch. I am just pointing out where we
already have similar workarounds, and where another workaround to cover
your case (and to help others) would likely be appreciated in the kernel.

Eric