Recursive/circular locking in serial8250_console_write/serial8250_do_startup

From: Guenter Roeck
Date: Wed Aug 12 2020 - 11:48:16 EST


Hi,

crbug.com/1114800 reports a hard lockup due to circular locking in the
8250 console driver. This is seen if CONFIG_PROVE_LOCKING is enabled.

The problem is as follows:
- serial8250_do_startup() locks the serial (console) port.
- serial8250_do_startup() then disables the port's interrupt, if that
  interrupt is shared, by calling disable_irq_nosync().
- disable_irq_nosync() calls __irq_get_desc_lock() to lock the interrupt
  descriptor.
- __irq_get_desc_lock() calls lock_acquire().
- If CONFIG_PROVE_LOCKING is enabled, validate_chain() and
  check_noncircular() are called and identify a potential locking error.
- This locking error is reported via printk(), which ultimately calls
  serial8250_console_write().
- serial8250_console_write() tries to lock the serial console port.
  Since it is already locked, the system hangs and ultimately reports
  a hard lockup.
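
To make the sequence above concrete, here is a minimal userspace sketch of
the re-entry. It is not kernel code: every model_* name, the irq_shared
parameter and SPIN_BUDGET are hypothetical stand-ins, the lock is a C11
atomic_flag rather than the real port spinlock, and the bounded spin only
imitates what the hard lockup detector would eventually report. The lockdep
report is modeled as an unconditional message from the IRQ-disable path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical spin limit; the real lock would spin forever. */
#define SPIN_BUDGET 1000

/* Stand-in for the port lock (not the kernel's spinlock_t). */
static atomic_flag port_lock = ATOMIC_FLAG_INIT;
static bool lockup_detected;

/* Spin until the lock is acquired, or give up after SPIN_BUDGET tries. */
static bool model_spin_lock(atomic_flag *lock)
{
	for (int i = 0; i < SPIN_BUDGET; i++) {
		if (!atomic_flag_test_and_set(lock))
			return true;
	}
	return false;
}

static void model_spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear(lock);
}

/* Models serial8250_console_write(): needs the port lock to emit text. */
static void model_console_write(const char *msg)
{
	if (!model_spin_lock(&port_lock)) {
		/* The caller already holds the lock: this is the hang. */
		lockup_detected = true;
		return;
	}
	fputs(msg, stdout);
	model_spin_unlock(&port_lock);
}

/* Models printk() ending up in the console driver. */
static void model_printk(const char *msg)
{
	model_console_write(msg);
}

/* Models disable_irq_nosync() with lockdep emitting a report. */
static void model_disable_irq_nosync(void)
{
	model_printk("possible circular locking dependency detected\n");
}

/* Models serial8250_do_startup(); returns true if the re-entry hung. */
static bool model_do_startup(bool irq_shared)
{
	bool acquired;

	lockup_detected = false;
	acquired = model_spin_lock(&port_lock);
	assert(acquired);	/* uncontended in this single-threaded model */
	if (irq_shared)
		model_disable_irq_nosync();
	model_spin_unlock(&port_lock);
	return lockup_detected;
}
```

In this model, model_do_startup(true) returns true because the nested
model_console_write() can never take the lock its caller holds, while
model_do_startup(false) returns false since no message is emitted under
the lock.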

I understand we'll need to figure out and fix what lockdep complains about,
and I am working on that. However, even if that is fixed, we'll need a
solution for the recursive lock: fixing the lockdep problem doesn't
guarantee that a similar problem (or some other log message) won't be
detected and reported sometime in the future while serial8250_do_startup()
holds the console port lock.

Ideas, anyone? Everything I have come up with so far seems clumsy and hackish.

Thanks,
Guenter