Re: main thread pthread_exit/sys_exit bug!

From: Oleg Nesterov
Date: Tue Feb 03 2009 - 08:36:20 EST


On 02/02, Kaz Kylheku wrote:
>
> On Mon, Feb 2, 2009 at 12:39 PM, Kaz Kylheku <kkylheku@xxxxxxxxx> wrote:
> > On Mon, Feb 2, 2009 at 12:17 PM, Ulrich Drepper <drepper@xxxxxxxxxx> wrote:
> >> The userlevel context of the
> >> thread is not usable anymore. It will have run all kinds of
> >> destructors. The current behavior is AFAIK that the main thread won't
> >> react to any signal anymore. That is absolutely required.
> >
> > Hey Ulrich,
> >
> > Thanks for articulating that requirement. I think it can be met by
> > extending the patch a little bit.
>
> I've now done that.
>
> The exiting thread leader, if there are still other
> threads alive, gets its own private signal handler array in which
> every action is set to SIG_IGN, using the ignore_signals
> function.
>
> I experimented with blocking signals, but that approach
> breaks the test case of being able to attach GDB to the
> exiting thread.
>
> As part of the patch, I found it convenient to extend the
> incomplete sys_unshare functionality w.r.t. signal handlers,
> rather than reinvent the wheel.

This is wrong, we can not and must not unshare ->sighand.

> Cheers ...
>
> http://sourceware.org/bugzilla/attachment.cgi?id=3702
> http://sourceware.org/bugzilla/attachment.cgi?id=3705

This adds multiple problems. For example, fs/proc/ takes
leader->sighand->siglock to protect the list of sub-threads.
Of course this no longer works after unsharing. And
there are numerous similar problems.

ignore_signals() in do_leader_exit() is not right either. This
thread group should handle the group-wide signals even if
the main thread exits.
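
To illustrate the group-wide signal point from user space, here is a
minimal sketch, not kernel code. The names leader_exit_signal_demo(),
worker() and on_usr1() are made up for this example. It assumes Linux
with glibc, where joining the initial thread works, and it assumes the
caller is the initial thread of its process. A forked child lets its
main thread call pthread_exit(); the remaining thread then sends a
process-directed SIGUSR1 and checks that the handler still runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* for callers that assert on the result */
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;
static pthread_t leader_thread;   /* the child's main thread */

static void on_usr1(int signo)
{
    (void)signo;
    got_signal = 1;
}

static void *worker(void *arg)
{
    (void)arg;

    /* Wait until the main thread is gone. Joining the initial thread
     * is an assumption; it works with glibc on Linux. */
    if (pthread_join(leader_thread, NULL) != 0)
        _exit(4);

    /* Process-directed signal, not aimed at any particular thread. */
    if (kill(getpid(), SIGUSR1) != 0)
        _exit(5);

    /* Give the handler a short grace period, then report the result. */
    for (int i = 0; i < 100 && !got_signal; i++) {
        struct timespec ts = { 0, 10 * 1000 * 1000 };   /* 10 ms */
        nanosleep(&ts, NULL);
    }
    _exit(got_signal ? 0 : 1);
}

/* Returns 0 if the group still handled the signal after its main
 * thread exited, 1 if it did not, and -1 on setup failure. */
int leader_exit_signal_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct sigaction sa;
        pthread_t tid;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_usr1;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGUSR1, &sa, NULL) != 0)
            _exit(2);

        alarm(5);   /* safety net: kill the child if something hangs */

        leader_thread = pthread_self();
        if (pthread_create(&tid, NULL, worker, NULL) != 0)
            _exit(3);

        pthread_exit(NULL);   /* main thread exits, process lives on */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

A caller would simply check that leader_exit_signal_demo() returns 0.
With every action forced to SIG_IGN for the whole group, the handler
would never run and the function would return 1.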

atomic_read(&sigh->count) in unshare_sighand() is racy, and
in fact bogus. (yes, the whole unshare_sighand() is bogus,
it never populates new_sighp).

The changing of ->sighand in do_unshare() is very wrong, we
can free the sighand_struct which is currently locked/used/etc.


Kaz, I don't really understand why you are trying to add these
complications to the kernel :(

If the thread exits - it should exit. Yes, we have problems
with the exited main thread, we should fix them.

Yes, gdb refuses to attach to the dead thread (I didn't check
this myself, but I think you are right). But there is nothing
wrong here, because we can't ptrace this thread. But, gdb
_can_ ptrace the process, and it can see that it has other threads.

OK, if nothing else, let's suppose your patch is correct. What
about robust futexes? How can we delay exit_robust_list()?
I don't think we can.
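
For readers who have not used robust futexes, a small user-space sketch
of the behaviour other threads rely on. This is only an illustration
through the POSIX robust mutex API, not the kernel path itself. The
names robust_owner_death_demo() and lock_and_exit() are made up for
this example. A thread exits while holding a robust mutex, and the
next locker is told about it with EOWNERDEAD instead of blocking
forever. That notification is tied to the owner's exit, which is why
the exit-time cleanup can not simply be postponed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* for callers that assert on the result */
#include <errno.h>
#include <pthread.h>

/* Takes the mutex and terminates without unlocking it. */
static void *lock_and_exit(void *arg)
{
    pthread_mutex_t *mtx = arg;

    pthread_mutex_lock(mtx);
    return NULL;   /* thread exits as the owner of the mutex */
}

/* Returns the result of locking a robust mutex whose owner has died,
 * or -1 on setup failure. EOWNERDEAD is the expected result. */
int robust_owner_death_demo(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mtx;
    pthread_t thr;
    int rc;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) != 0 ||
        pthread_mutex_init(&mtx, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    if (pthread_create(&thr, NULL, lock_and_exit, &mtx) != 0) {
        pthread_mutex_destroy(&mtx);
        return -1;
    }
    pthread_join(thr, NULL);   /* the owner is now dead */

    rc = pthread_mutex_lock(&mtx);
    if (rc == EOWNERDEAD) {
        /* We hold the lock; mark the state consistent and release it. */
        pthread_mutex_consistent(&mtx);
        pthread_mutex_unlock(&mtx);
    } else if (rc == 0) {
        pthread_mutex_unlock(&mtx);
    }
    pthread_mutex_destroy(&mtx);
    return rc;
}
```

A caller would compare the return value against EOWNERDEAD.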

Oleg.
