Re: kswapd deadlock 2.4.3-pre6

From: Linus Torvalds (torvalds@transmeta.com)
Date: Wed Mar 21 2001 - 15:29:55 EST

Next message: Alexander Viro: "Re: spinlock usage - ext2_get_block, lru_list_lock"
Previous message: Kevin Buhr: "Re: Linux 2.4.2 fails to merge mmap areas, 700% slowdown."
In reply to: Mike Galbraith: "kswapd deadlock 2.4.3-pre6"
Next in thread: Linus Torvalds: "Re: kswapd deadlock 2.4.3-pre6"
Reply: Linus Torvalds: "Re: kswapd deadlock 2.4.3-pre6"

On Wed, 21 Mar 2001, Mike Galbraith wrote:
>
> I have a repeatable deadlock when SMP is enabled on my UP box.
>
> >>EIP; c021e29a <stext_lock+1556/677b> <=====

When you see something like this, please do

	gdb vmlinux

	(gdb) x/10i 0xc021e29a

and it will basically show you where the code jumps back to.

It's almost certainly the beginning of swap_out_mm() where we get the
page_table_lock, but it would still be good to verify.

The deadlock implies that somebody scheduled with page_table_lock held.
Which would be really bad. You should be able to do something like

	if (current->mm && spin_is_locked(&current->mm->page_table_lock))
		BUG();

in the scheduler to see if it triggers (this only works on UP hardware
with an SMP kernel. On a real SMP machine it's entirely legal for the
lock to be held during a schedule, as it may be held by any of the _other_
CPUs, of course, so the above assert would be the wrong thing to do in
general).
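
As a rough illustration of what that check tests, here is a user-space
model in C11. It is only a sketch: every toy_* name is hypothetical and
stands in for the kernel's spinlock_t, mm_struct and task_struct, and the
function returns true where the real check would call BUG(). The mm
pointer test is there because kernel threads run without an mm.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for spinlock_t: 1 = held, 0 = free. */
typedef struct { atomic_int locked; } toy_spinlock_t;

static inline void toy_spin_lock(toy_spinlock_t *l)
{
	int expected = 0;

	/* Spin until we manage to flip the lock from free to held. */
	while (!atomic_compare_exchange_weak(&l->locked, &expected, 1))
		expected = 0;
}

static inline bool toy_spin_is_locked(toy_spinlock_t *l)
{
	return atomic_load(&l->locked) != 0;
}

static inline void toy_spin_unlock(toy_spinlock_t *l)
{
	/* Unlocking a free lock would itself be a bug. */
	assert(toy_spin_is_locked(l));
	atomic_store(&l->locked, 0);
}

/* Hypothetical stand-ins for mm_struct and task_struct. */
struct toy_mm { toy_spinlock_t page_table_lock; };
struct toy_task { struct toy_mm *mm; };

/*
 * Models the assert at the top of the scheduler. With a single CPU,
 * nobody else can be holding the lock, so a held lock at this point
 * means the task entering the scheduler is the holder.
 */
static inline bool toy_schedule_check(struct toy_task *current_task)
{
	return current_task->mm != NULL &&
	       toy_spin_is_locked(&current_task->mm->page_table_lock);
}
```

In this model the check stays quiet for a task with no mm and for a task
whose lock is free, and fires only between toy_spin_lock() and
toy_spin_unlock() on that task's own mm.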

Of course, it might not be somebody scheduling with a spinlock, it might
just be a recursive lock bug, but that sounds really unlikely.
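
For comparison, a recursive lock bug looks roughly like the user-space
sketch below: a function takes a non-recursive lock and then calls a
helper that takes the same lock again. The demo_* names are hypothetical,
and a trylock is used in place of a real spin_lock() so that the second
acquisition reports failure instead of spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock: 1 = held, 0 = free. */
typedef struct { atomic_int locked; } demo_lock_t;

static inline bool demo_trylock(demo_lock_t *l)
{
	int expected = 0;

	/* Succeeds only if the lock was free. */
	return atomic_compare_exchange_strong(&l->locked, &expected, 1);
}

static inline void demo_unlock(demo_lock_t *l)
{
	assert(atomic_load(&l->locked) == 1);
	atomic_store(&l->locked, 0);
}

/* Helper that takes the lock on its own behalf. */
static inline bool demo_inner(demo_lock_t *l)
{
	if (!demo_trylock(l))
		return false;	/* a real spin_lock() would hang here */
	demo_unlock(l);
	return true;
}

/* Caller that already holds the lock when it calls the helper. */
static inline bool demo_outer_recursive(demo_lock_t *l)
{
	bool ok;

	if (!demo_trylock(l))
		return false;
	ok = demo_inner(l);	/* second acquisition on the same path */
	demo_unlock(l);
	return ok;
}
```

Called on a free lock, demo_inner() succeeds, while
demo_outer_recursive() reports failure because the inner acquisition
finds the lock already held by its own caller.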

> ac20+2.4.2-ac20-rwmmap_sem3 does not deadlock doing the same
> churn/burn via make -j30 bzImage.

That kernel won't do the page table locking for page table allocations,
so it will have other bugs, though.

Linus

This archive was generated by hypermail 2b29 : Fri Mar 23 2001 - 21:00:16 EST