Re: Linux 2.6.31-rc9

From: Linus Torvalds
Date: Tue Sep 08 2009 - 20:37:57 EST



[ Added some people to the cc - this is very directly related to the
previous thread on "v2.6.31-rc6: BUG: unable to handle kernel NULL
pointer dereference at 0000000000000008", and the deadlock discussion
there ]

On Tue, 8 Sep 2009, Ingo Molnar wrote:
>
> FYI, i'm getting very (very) rare warnings from the TTY code in this
> place:
>
> [ 28.187364] rc.sysinit used greatest stack depth: 5224 bytes left
> [ 31.422457] Adding 3911816k swap on /dev/sda2. Priority:-1 extents:1 across:3911816k
> [ 32.974830] ssh used greatest stack depth: 5200 bytes left
> [ 33.115028] ------------[ cut here ]------------
> [ 33.119518] WARNING: at drivers/char/tty_io.c:1267 __tty_open+0x3ef/0x4c0()

Hmm. I think I see why, and I _suspect_ this is harmless, although it's
obviously very annoying, and it really is indicative of a real locking
problem.

What's going on is that same horrible deadlocak-avoidance where we have to
drop the ldisc_mutex after clearing TTY_LDISC, in order to then wait for
any pending work. See commit 5c58ceff103d8a654f24769bb1baaf84a841b0cc,
which is probably also the one that introduced the timing that gets your
particular warning.

So when __tty_open() does this:

mutex_lock(&tty->ldisc_mutex);
WARN_ON(!test_bit(TTY_LDISC, &tty->flags));
mutex_unlock(&tty->ldisc_mutex);

it's really warning about something that really can happen: the things
that clear TTY_LDISC will all release the ldisc_mutex with that bit still
clear, because they all end up having to release the lock that they
_should_ hold in order to avoid a deadlock.

So the warning is "real" in the sense that it does show a real locking
problem. It's probably not _relevant_ in that it probably will never cause
any other issues in practice.

> I got it on two systems so far. Config attached (but is probably
> irrelevant). The warnings started in the .31 cycle. They occur every
> 1000-2000 random kernels - i.e. every few days.

Yeah, the configuration won't matter.

> These warnings were never fatal and my guess is that they are
> ancient, pre-existing races in the TTY code - but wanted to mention
> them here in case they matter.

The issue is pre-existing, yes - we've always done that

tty_ldisc_halt(tty);
flush_scheduled_work();

outside the ldisc_mutex, but the commit mentioned above (5c58ceff) added a
new case where we do it (it used to be in just tty_set_ldisc() and in
tty_ldisc_release()). So it's a pre-existing issue that probably just got
_way_ easier to hit fairly recently.

Quite frankly, the ldisc_mutex problem is not fixable at this stage in
2.6.31, and it's probably not worth worrying about. I'm planning on
revisiting this after releasing 2.6.31 (probably just deciding that the
sane way to fix it is to turn that flush_to_ldisc thing into just a timer,
not a delayed work - which allows us to hold the mutex), but there's no
way I'm doing that before..

If the fix turns out straightforward, we can back-port it through stable.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/