Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops

From: Ingo Molnar
Date: Mon Sep 13 2004 - 03:31:47 EST



* Kirill Korotaev <dev@xxxxx> wrote:

> This patch removes sighand checks from the next_thread(), since they
> are incorrect and has nothing to do with the next_thread() function.
> So they could trigger BUG() when there were no actually bug at all.

the problem is, generally it is not valid to have a thread on the thread
list that has no ->sighand structure. This is what happens when we exit
a task:

write_lock_irq(&tasklist_lock);
...
__exit_sighand(p);
__unhash_process(p);

the BUG() is useful for all the code that uses next_thread() - you can
only do a safe next_thread() iteration if you've locked ->sighand.

there's one exception: in the procfs code we can get a reference to
almost-dead tasks as well that are not even in the tasklist. (This is a
relatively new thing introduced by me that can happen due to the
preemptability of some of the exit path.)

so i believe your fix papers over the real bug which is the use of an
almost-dead task for thread iterations. Since we've already done
__unhash_process() not doing the BUG introduces a more subtle bug: the
use of the stale PID pointers! So i believe the right fix is the one
below, which (under the safety of read_lock(tasklock)) checks for the
availability of task->sighand - and skips the thread iterations if so.

Ingo

--- linux/fs/proc/array.c.orig
+++ linux/fs/proc/array.c
@@ -356,7 +356,7 @@ static int do_task_stat(struct task_stru
stime = task->signal->stime;
}
}
- if (whole) {
+ if (whole && task->sighand) {
t = task;
do {
min_flt += t->min_flt;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/