bisected: ttyS panic on pa-risc

From: Meelis Roos
Date: Thu Jan 10 2019 - 10:54:58 EST


My HP 9000 A500 (PA-RISC architecture) panicked in 5.0-rc1. The panic happened after the dmesg lines about ttyS were printed and before the SCSI printk output started.
I bisected it; the panic symptoms changed during the bisection (some kernels printed a backtrace, some just panicked).

This is one of the crashes I got:
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
serial 0000:00:04.0: enabling device (0146 -> 0147)
printk: console [ttyS0] disabled

0000:00:04.0: ttyS0 at MMIO 0xfffffffff8000000 (irq = 21, base_baud = 115200) is a 16550A
printk: console [ttyS0] enabled
printk: console [ttyS0] enabled
printk: bootconsole [ttyB0] disabled
printk: bootconsole [ttyB0] disabled
0000:00:04.0: ttyS1 at MMIO 0xfffffffff8000008 (irq = 21, base_baud = 115200) is a 16550A
0000:00:04.0: ttyS2 at MMIO 0xfffffffff8000010 (irq = 21, base_baud = 115200) is a 16550A
serial 0000:00:05.0: enabling device (0140 -> 0143)
0000:00:05.0: ttyS3 at MMIO 0xfffffffff8005000 (irq = 22, base_baud = 115200) is a 16550A
Backtrace:
[<0000000040502268>] pciserial_init_ports+0x128/0x240
[<00000000405040b8>] pciserial_init_one+0x1e0/0x2f0
[<00000000404b2b8c>] pci_device_probe+0xfc/0x180
[<0000000040513958>] really_probe+0x268/0x3d0
[<0000000040513d28>] driver_probe_device+0xf8/0x100
[<0000000040513e54>] __driver_attach+0x124/0x130
[<0000000040510dc4>] bus_for_each_dev+0x9c/0xe8
[<0000000040513040>] driver_attach+0x28/0x38
[<00000000405128c0>] bus_a

Normal dmesg excerpt from a working kernel, from before the problem appeared:

[ 6.746131] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 6.771772] serial 0000:00:04.0: enabling device (0146 -> 0147)
[ 6.792657] printk: console [ttyS0] disabled
[ 6.829825] 0000:00:04.0: ttyS0 at MMIO 0xfffffffff8000000 (irq = 21, base_baud = 115200) is a 16550A
[ 6.837151] printk: console [ttyS0] enabled
[ 6.877768] printk: bootconsole [ttyB0] disabled
[ 6.904352] 0000:00:04.0: ttyS1 at MMIO 0xfffffffff8000008 (irq = 21, base_baud = 115200) is a 16550A
[ 6.961051] 0000:00:04.0: ttyS2 at MMIO 0xfffffffff8000010 (irq = 21, base_baud = 115200) is a 16550A
[ 6.969881] serial 0000:00:05.0: enabling device (0000 -> 0003)
[ 7.004160] serial 0000:00:05.0: enabling SERR and PARITY (0003 -> 0143)
[ 7.030298] 0000:00:05.0: ttyS3 at MMIO 0xfffffffff8005000 (irq = 22, base_baud = 115200) is a 16550A
[ 7.041663] serial 0000:00:05.0: Couldn't register serial port 0, irq 22, type 2, error -28
[ 7.145456] sym53c8xx 0000:00:01.0: enabling device (0000 -> 0003)


Bisection leads to this commit:

6d7f677a2afa1c82d7fc7af7f9159cbffd5dc010 is the first bad commit
commit 6d7f677a2afa1c82d7fc7af7f9159cbffd5dc010
Author: Darwin Dingel <darwin.dingel@xxxxxxxxxxxxxxxxxxx>
Date: Mon Dec 10 11:29:09 2018 +1300

serial: 8250: Rate limit serial port rx interrupts during input overruns

When a serial port gets faulty or gets flooded with inputs, its interrupt
handler starts to work double time to get the characters to the workqueue
for the tty layer to handle them. When this busy time on the serial/tty
subsystem happens during boot, where it is also busy on the userspace
trying to initialise, some processes can continuously get preempted
and will be on hold until the interrupts subside.

The fix is to backoff on processing received characters for a specified
amount of time when an input overrun is seen (received a new character
before the previous one is processed). This only stops receive and will
continue to transmit characters to serial port. After the backoff period
is done, it receive will be re-enabled. This is optional and will only
be enabled by setting 'overrun-throttle-ms' in the dts.

Signed-off-by: Darwin Dingel <darwin.dingel@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

:040000 040000 4ea6cd68ededa0c9ffaa218668ffeb35557070a5 a011db1916fbf5cfdcfff836a81e4fb5ee737003 M drivers
:040000 040000 b1b1dc977965eb2db6b2cc79939446a1cf2f684d 41322ab1c199f504cfcc5b2ca211b4638d41351c M include
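
For readers who do not know the commit, the backoff its message describes can be pictured roughly as follows. This is only a toy Python model of the idea (stop receiving for a fixed time after an overrun, then re-enable), not the 8250 driver code. The names ThrottledPort, on_rx and backoff_ms are made up for illustration; backoff_ms merely plays the role of 'overrun-throttle-ms'. Counting the overrunning character as dropped is a modelling assumption, and transmit is not modelled at all.

```python
from dataclasses import dataclass, field


@dataclass
class ThrottledPort:
    """Toy model of a receive path that backs off after an input overrun."""
    backoff_ms: int                      # hypothetical stand-in for 'overrun-throttle-ms'
    rx_enabled: bool = True
    resume_at_ms: int = 0
    received: list = field(default_factory=list)
    dropped: int = 0

    def on_rx(self, now_ms, char, overrun=False):
        # Re-enable receive once the backoff period has elapsed.
        if not self.rx_enabled and now_ms >= self.resume_at_ms:
            self.rx_enabled = True
        if not self.rx_enabled:
            # Still backing off: the character is not processed.
            self.dropped += 1
            return
        if overrun and self.backoff_ms > 0:
            # Overrun seen and throttling configured: stop receiving for backoff_ms.
            self.rx_enabled = False
            self.resume_at_ms = now_ms + self.backoff_ms
            self.dropped += 1
            return
        self.received.append(char)


port = ThrottledPort(backoff_ms=100)
port.on_rx(0, "a")                  # accepted
port.on_rx(10, "b", overrun=True)   # overrun: receive is off until t=110
port.on_rx(50, "c")                 # arrives during backoff, not processed
port.on_rx(120, "d")                # backoff is over, accepted again
print(port.received, port.dropped)
```

With backoff_ms set to 0 the model never throttles, which mirrors the commit message saying the feature is optional and only enabled when the property is set.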


--
Meelis Roos <mroos@xxxxxxxx>