Re: open() on /dev/tty takes 30 seconds on 2.6.36

From: Arnd Bergmann
Date: Sat Oct 30 2010 - 23:35:56 EST


On Sunday 31 October 2010, Eli Billauer wrote:
> Andrew Morton wrote:
>
> First, I'll explain what I did: Since I can't login while this problem
> occurs, I had to modify sshd's service script to start sshd with strace,
> and also kick off a script (called who-is-locking.sh), which creates
> that sysrq call every five seconds. This could also be the place to
> mention, that I got an NULL dereference oops during one of these (IP at
> get_usage_chars+0x0/0x10e, while in the who-is-locking process). But
> that happened after the desired data was logged.

An oops report is interesting most of the time, except when you know
that you caused it with your own kernel changes. If you can still
find the oops report, please post it here.

> Oct 31 01:14:28 ocho kernel: Serial: 8250/16550 driver, 4 ports, IRQ
> sharing enabled
> Oct 31 01:14:28 ocho kernel: serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is
> a 16550A
> Oct 31 01:14:28 ocho kernel: 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> Oct 31 01:14:28 ocho kernel: serial 0000:05:04.0: PCI INT A -> GSI 16
> (level, low) -> IRQ 16
> Oct 31 01:14:28 ocho kernel: 0000:05:04.0: ttyS1 at I/O 0xbe20 (irq =
> 16) is a 8250
> Oct 31 01:14:28 ocho kernel: 0000:05:04.0: ttyS2 at I/O 0xbe28 (irq =
> 16) is a 8250
> Oct 31 01:14:28 ocho kernel: 0000:05:04.0: ttyS3 at I/O 0xbe40 (irq =
> 16) is a 8250
> Oct 31 01:14:28 ocho kernel: Couldn't register serial port 0000:05:04.0: -28

For some reason, your kernel finds five ports at boot time, while it's
been configured at build time to support only four.

> Oct 31 01:23:59 ocho kernel: =============================================
> Oct 31 01:23:59 ocho kernel:
> Oct 31 01:24:04 ocho kernel: SysRq : Show Locks Held
> Oct 31 01:24:04 ocho kernel:
> Oct 31 01:24:04 ocho kernel: Showing all locks held in the system:
> Oct 31 01:24:04 ocho kernel: 2 locks held by modem-manager/1835:
> Oct 31 01:24:04 ocho kernel: #0: (big_tty_mutex){+.+.+.}, at:
> [<ffffffff814b9c09>] tty_lock+0x3f/0x4a
> Oct 31 01:24:04 ocho kernel: #1: (&port->mutex){+.+.+.}, at:
> [<ffffffff812fcace>] uart_close+0x82/0x312

So modem-manager is the process you need to look at, as it holds the
BTM, probably for a long time. It should not do that. First you should
verify that when you don't have modem-manager running, the problem goes
away. This may indeed be related to the misdetection of a serial port.

The other thing to check is what modem-manager is actually doing. You can
find out by looking at /proc/<pid>/stack, where <pid> is the pid of the
modem-manager process, 1835 in your example.

You can also look at <sysrq>+t to get all the stack traces for the running
processes and finding the process in there.

I think you're closing in on the bug, thanks to your providing exactly the
right parts of information.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/