Q: SIGSEGV && uc_mcontext->ip (Was: Signal delivery order)

From: Oleg Nesterov
Date: Tue Mar 17 2009 - 00:18:25 EST


(see http://marc.info/?t=123704955800002)

First of all, perhaps I missed something and this is solvable without
kernel changes, but I can't see how.

To simplify, suppose that the application wants to log the faulting
instruction before exit, so it installs a handler for SIGSEGV:

void sigsegv_handler(int sig, siginfo_t *info, struct ucontext *context)
{
	fprintf(stderr, "bug at %p\n", context->uc_mcontext->ip);
	exit(1);
}

But this doesn't work reliably. It is possible that, before the task dequeues
SIGSEGV, another private signal, say SIGHUP (note that SIGHUP < SIGSEGV), is
sent to this app.

In this case the task dequeues both signals, SIGHUP and then SIGSEGV, before
returning to user space. This is correct, but now uc_mcontext->ip points to
sighup_handler, not to the faulting instruction.


Can/should we change force_sig_info_fault() to help?

We can add siginfo_t->_sigfault.ip to solve this problem. This shouldn't
enlarge the size of siginfo_t. With this change sigsegv_handler() always
knows the address of the instruction which caused the fault.
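
For comparison, here is a sketch of what a handler can read today, assuming
Linux with glibc on x86-64, i386 or arm64. For SIGSEGV, info->si_addr is the
address of the faulting memory reference, not of the instruction; the
instruction address is only available from the saved context. The proposed
_sigfault.ip does not exist, so it is not used here. saved_pc() and
probe_fault() are names invented for this example, and calling mprotect()
from a handler is a shortcut taken only to let the faulting load restart.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

/* Saved program counter in the signal frame (Linux/glibc register names). */
static uintptr_t saved_pc(const ucontext_t *uc)
{
#if defined(__x86_64__)
	return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__i386__)
	return (uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
#elif defined(__aarch64__)
	return (uintptr_t)uc->uc_mcontext.pc;
#else
#error "add the saved-PC accessor for this architecture"
#endif
}

static void *guard_page;			/* PROT_NONE page we read from */
static size_t guard_len;
static volatile sig_atomic_t fault_count;
static volatile uintptr_t fault_data_addr;	/* from info->si_addr */
static volatile uintptr_t fault_pc;		/* from the saved context */

static void sigsegv_handler(int sig, siginfo_t *info, void *context)
{
	(void)sig;
	fault_count++;
	fault_data_addr = (uintptr_t)info->si_addr;
	fault_pc = saved_pc(context);

	/* Make the page readable so the restarted load succeeds. */
	if (mprotect(guard_page, guard_len, PROT_READ) != 0)
		_exit(2);
}

/* Hypothetical helper: triggers one real fault, returns the fault count. */
static int probe_fault(void)
{
	struct sigaction sa;

	fault_count = 0;
	guard_len = (size_t)sysconf(_SC_PAGESIZE);
	guard_page = mmap(NULL, guard_len, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (guard_page == MAP_FAILED)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_sigaction = sigsegv_handler;
	sa.sa_flags = SA_SIGINFO;
	if (sigaction(SIGSEGV, &sa, NULL) != 0)
		return -1;

	(void)*(volatile char *)guard_page;	/* faults once, then restarts */
	return fault_count;
}
```

With no other signal pending, fault_pc is the faulting instruction and
fault_data_addr is the page address. An ip field in siginfo_t would make the
former available even when the saved context no longer holds it.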


But this alone doesn't allow the handler to change uc_mcontext->ip to, say,
skip the offending instruction or call another function which does a fixup.
Actually, ->si_ip helps here too; we can do:

void sigsegv_handler(int sig, siginfo_t *info, struct ucontext *context)
{
	if (info->si_ip != context->uc_mcontext->ip)
		/*
		 * The offending instruction will be repeated, and
		 * we will have the "same" SIGSEGV again.
		 */
		return;

	context->uc_mcontext->ip = fixup;
	...
}

But this doesn't look very nice. So, perhaps we can do another change?

--- arch/x86/mm/fault.c
+++ arch/x86/mm/fault.c
@@ -177,6 +177,13 @@ static void force_sig_info_fault(int si_
 {
 	siginfo_t info;

+	current->saved_sigmask = current->blocked;
+	spin_lock_irq(&current->sighand->siglock);
+	siginitsetinv(&current->blocked, sigmask(si_signo) |
+			sigmask(SIGKILL) | sigmask(SIGSTOP));
+	spin_unlock_irq(&current->sighand->siglock);
+	set_restore_sigmask();
+
 	info.si_signo = si_signo;
 	info.si_errno = 0;
 	info.si_code = si_code;

But this is a user-visible change: all other signals (except SIGKILL and
SIGSTOP) will be blocked until sigsegv_handler() returns. On the other hand,
with this change sigsegv_handler() always has the "correct" rt_sigframe.


Comments?

And once again, I have a nasty feeling I missed something and we don't
need to change the kernel.

Oleg.
