[no patch] broken use of mm_release / deactivate_mm

From: Andries Brouwer
Date: Mon Sep 13 2004 - 14:09:07 EST


Recent kernels have a bug in fork(). Things can be improved a bit
by commenting out the lines

/* Get rid of any cached register state */
deactivate_mm(tsk, mm);

in fork.c:mm_release().

What happens at a fork, is that a long sequence of things is done,
and if a failure occurs all previous things are undone. Thus
(in copy_process()):

if ((retval = copy_mm(clone_flags, p)))
goto bad_fork_cleanup_signal;
if ((retval = copy_namespace(clone_flags, p)))
goto bad_fork_cleanup_mm;
retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
if (retval)
goto bad_fork_cleanup_namespace;

...

bad_fork_cleanup_namespace:
exit_namespace(p);
bad_fork_cleanup_mm:
exit_mm(p);
if (p->active_mm)
mmdrop(p->active_mm);
bad_fork_cleanup_signal:
...

Thus, we may do exit_mm() when an attempted fork fails.
The argument of exit_mm() is this new, not completeley initialized
task_struct.

Now exit.c:exit_mm(p) does mm_release(p,p->mm), and this
mm_release() does deactivate_mm(), a macro that clears %fs and %gs.
Ach. A segfault is the result.

More is wrong with mm_release(). It examines p->clear_child_tid,
and possibly does put_user(0, tidptr);.
Oops.

In our case p->clear_child_tid had not yet been initialized for
the child, that happens in copy_thread() that we never reached.
So this is the value the parent had.

Also the call
enter_lazy_tlb(mm, current);
seems strange in this context.

It seems to me that the proper action is some reshuffling
of this code. Maybe it is cleanest to separate the cleanup
code for a failed fork entirely from the code for an exiting
process.

Andries
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/