PATCH for 2.3.12: swap-related deadlock

Kevin Buhr (buhr@stat.wisc.edu)
02 Aug 1999 00:50:49 -0500


Linus:

Two things: a patch and a question.

First, if there is no swap space available, "try_to_swap_out()" can
exit without unlocking the "page_table_lock" spinlock. I tickled this
bug during a boot-time "fsck" before swap space had been added.

The following patch (against 2.3.12) fixes this problem:

--- linux-2.3.12/mm/vmscan.c Sun Aug 1 19:01:42 1999
+++ linux-2.3.12-mine/mm/vmscan.c Sun Aug 1 22:42:03 1999
@@ -155,7 +155,7 @@
*/
entry = get_swap_page();
if (!entry)
- goto out_failed; /* No swap space left */
+ goto out_failed_unlock; /* No swap space left */

vma->vm_mm->rss--;
tsk->nswap++;

Second, I originally observed (and debugged) this problem under
2.3.11. When I'd applied the above patch to 2.3.11, the hangs were
replaced by reboots. I narrowed it down to a change in "fs/buffer.c"
introduced in 2.3.11 to switch "bdflush" to a "lazy TLB mode".
Backing out the change stopped the reboots.

I hypothesized a race condition: to enter lazy TLB mode, the
"sys_bdflush()" function set "current->mm" to NULL. If this happened
immediately after "swap_out()" had checked its validity, then
"swap_out()"'s helper functions would dereference a NULL pointer.
Since "swap_out()" was running with the kernel lock held, I added a
"lock_kernel()/unlock_kernel()" pair around the "current->mm = NULL"
statement, and that stopped the reboots.

However, I wasn't able to duplicate the reboot problem under 2.3.12,
so perhaps my hypothesis was incorrect.

What is the purpose of "task->active_mm" versus "task->mm"? Should
"start_lazy_tlb()" perform "current->mm = NULL" while holding the
kernel lock? Or should "swap_out()" and its helper functions be
looking at "tsk->active_mm" instead of "tsk->mm"?

Thanks.

Kevin <buhr@stat.wisc.edu>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/