Re: possible recursive locking detected n_tty_write vs. process_echoes

From: Peter Hurley
Date: Fri Feb 06 2015 - 08:31:08 EST


Hi Andreas,

On 02/06/2015 07:03 AM, Andreas Starzer wrote:
> With enabled deadlock detection the following warning occurs when
> connected via ssh (Dropbear server v2015.67).
> With "while :; do dmesg -c; done" the problem was fast reproduceable.
>
> System: Kernel with RT-patch applied:
> Linux version 3.10.63-rt65 (gcc version 4.9.2 (iMX6) ) #1 SMP PREEMPT RT
>
> This seems to influence the RT-behavior somehow!!!

This recursion just needs lockdep annotation in the RT patch.

Each pty end has its own output_lock, and its not endlessly recursive
because only the slave end can echo. So if the slave end is
assigned a different lockdep subclass then the splat will go away.

This is how all recursive pty locks are now handled in 3.20; ie.,
with fixed lockdep subclasses.

The recursion doesn't happen in mainline because input is handled
by an autonomous worker (iow, tty_flip_buffer_push() doesn't execute
flush_to_ldisc() directly).

I would think the jitter is related to lockdep checking which is
disabled on first splat.

Regards,
Peter Hurley


> I made a 1ms nano-sleep timer with jitter observation and the jitter
> is quite huge until this warning occured.
>
> After this warning showed up once the jitter is gone!!!
>
> Without Full-Preemption the warning does not show up. (Tried: Desktop
> / Low-Latency desktop / Basic RT)
>
> Please help!
>
> CPU: Freescale i.MX6 Quad
> The jitter observer is a simple kthread with prio 60 (so above IRQs)
> which does hrtimer_nanosleep for 1ms and measures the elapsed time
> with getrawmonotonic.
>
>
> [ 416.153094]
> [ 416.153097] =============================================
> [ 416.153099] [ INFO: possible recursive locking detected ]
> [ 416.153106] 3.10.63-rt65-svn68 #1 Tainted: G O
> [ 416.153108] ---------------------------------------------
> [ 416.153113] dropbear/306 is trying to acquire lock:
> [ 416.153151] (&ldata->output_lock){+.+...}, at: [<802bc728>]
> process_echoes+0x48/0x2b4
> [ 416.153154]
> [ 416.153154] but task is already holding lock:
> [ 416.153168] (&ldata->output_lock){+.+...}, at: [<802bce2c>]
> n_tty_write+0x148/0x464
> [ 416.153169]
> [ 416.153169] other info that might help us debug this:
> [ 416.153172] Possible unsafe locking scenario:
> [ 416.153172]
> [ 416.153174] CPU0
> [ 416.153176] ----
> [ 416.153180] lock(&ldata->output_lock);
> [ 416.153185] lock(&ldata->output_lock);
> [ 416.153187]
> [ 416.153187] *** DEADLOCK ***
> [ 416.153187]
> [ 416.153189] May be due to missing lock nesting notation
> [ 416.153189]
> [ 416.153193] 2 locks held by dropbear/306:
> [ 416.153209] #0: (&tty->atomic_write_lock){+.+...}, at:
> [<802b9b04>] tty_write_lock+0x1c/0x5c
> [ 416.153223] #1: (&ldata->output_lock){+.+...}, at: [<802bce2c>]
> n_tty_write+0x148/0x464
> [ 416.153225]
> [ 416.153225] stack backtrace:
> [ 416.153232] CPU: 3 PID: 306 Comm: dropbear Tainted: G O
> 3.10.63-rt65-svn68 #1
> [ 416.153237] Backtrace:
> [ 416.153261] [<80011a60>] (dump_backtrace+0x0/0x108) from
> [<80011c70>] (show_stack+0x18/0x1c)
> [ 416.153273] r6:809dac30 r5:808b7944 r4:809dac30 r3:00000000
> [ 416.153290] [<80011c58>] (show_stack+0x0/0x1c) from [<80615888>]
> (dump_stack+0x24/0x28)
> [ 416.153313] [<80615864>] (dump_stack+0x0/0x28) from [<80071c88>]
> (__lock_acquire+0x1d04/0x2018)
> [ 416.153325] [<8006ff84>] (__lock_acquire+0x0/0x2018) from
> [<80072790>] (lock_acquire+0x68/0x7c)
> [ 416.153340] [<80072728>] (lock_acquire+0x0/0x7c) from [<80618e74>]
> (_mutex_lock+0x38/0x48)
> [ 416.153350] r7:8c9d1000 r6:8c9d0000 r5:8cd5d58e r4:8c9d12a8
> [ 416.153364] [<80618e3c>] (_mutex_lock+0x0/0x48) from [<802bc728>]
> (process_echoes+0x48/0x2b4)
> [ 416.153368] r4:8c9d12f8
> [ 416.153381] [<802bc6e0>] (process_echoes+0x0/0x2b4) from
> [<802beb1c>] (n_tty_receive_buf+0x1040/0x1044)
> [ 416.153396] [<802bdadc>] (n_tty_receive_buf+0x0/0x1044) from
> [<802c26e4>] (flush_to_ldisc+0x11c/0x16c)
> [ 416.153407] [<802c25c8>] (flush_to_ldisc+0x0/0x16c) from
> [<802c2778>] (tty_flip_buffer_push+0x44/0x48)
> [ 416.153419] [<802c2734>] (tty_flip_buffer_push+0x0/0x48) from
> [<802c38a0>] (pty_write+0x5c/0x6c)
> [ 416.153425] r5:8c573800 r4:00000001
> [ 416.153436] [<802c3844>] (pty_write+0x0/0x6c) from [<802bce44>]
> (n_tty_write+0x160/0x464)
> [ 416.153446] r6:8c573800 r5:8c9d1400 r4:00000001 r3:802c3844
> [ 416.153459] [<802bcce4>] (n_tty_write+0x0/0x464) from [<802b9c60>]
> (tty_write+0x11c/0x2e0)
> [ 416.153475] [<802b9b44>] (tty_write+0x0/0x2e0) from [<800d1e10>]
> (vfs_write+0xb8/0x194)
> [ 416.153484] [<800d1d58>] (vfs_write+0x0/0x194) from [<800d2394>]
> (SyS_write+0x44/0x80)
> [ 416.153498] r9:00000000 r8:00000000 r7:013002cc r6:00000000 r5:00000001
> [ 416.153498] r4:8c6cf540
> [ 416.153511] [<800d2350>] (SyS_write+0x0/0x80) from [<8000e580>]
> (ret_fast_syscall+0x0/0x48)
> [ 416.153524] r9:8c9a8000 r8:8000e744 r7:00000004 r6:012f9128 r5:00000001
> [ 416.153524] r4:012f9668
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/