Re: [PATCH] seccomp: suppress fatal signals that will never be delivered before seccomp forces an exit because of said signals

From: Kees Cook
Date: Wed Aug 10 2016 - 17:35:34 EST


On Wed, Aug 10, 2016 at 12:50 PM, Kyle Huey <me@xxxxxxxxxxxx> wrote:
> This fixes rr. It doesn't quite fix the provided testcase, because the testcase fails to wait on the tracee after awakening from the nanosleep. Instead the testcase immediately does a PTHREAD_CONT, discarding the PTHREAD_EVENT_EXIT. The slightly modified testcase at https://gist.github.com/khuey/3c43ac247c72cef8c956c does pass.
>
> I don't see any obvious way to dequeue only the fatal signal, so instead I dequeue them all. Since none of these signals will ever be delivered it shouldn't affect the executing task.
>
> Suggested-by: Robert O'Callahan <robert@xxxxxxxxxxxxx>
> Signed-off-by: Kyle Huey <khuey@xxxxxxxxxxxx>

Oh excellent! Thank you for the patch; I've been chasing other things
this week and hadn't had time yet to dig into this. I agree with your
rationale: the signals aren't being delivered anyway, so drop them to
keep ptrace operating as expected.

Oleg, does this pass a quick sanity-check from you? For context, the
original problem description was:


The problem is that if a tracee task is in a PTRACE_EVENT_SECCOMP
trap, or has been resumed after such a trap but not yet been
scheduled, and another task in the thread-group calls exit_group(),
then the tracee task exits without the ptracer receiving a
PTRACE_EVENT_EXIT notification. Small-ish testcase here:
https://gist.github.com/rocallahan/1344f7d01183c233d08a2c6b93413068.

The bug happens because when __seccomp_filter() detects
fatal_signal_pending(), it calls do_exit() without dequeuing the fatal
signal. When do_exit() sends the PTRACE_EVENT_EXIT notification and
that task is descheduled, __schedule() notices that there is a fatal
signal pending and changes its state from TASK_TRACED to TASK_RUNNING.
That prevents the ptracer's waitpid() from returning the ptrace event.
A more detailed analysis is here:
https://github.com/mozilla/rr/issues/1762#issuecomment-237396255.


Thanks!

-Kees

> ---
> kernel/seccomp.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index ef6c6c3..728074d 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -609,8 +609,20 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
> * Terminating the task now avoids executing a system
> * call that may not be intended.
> */
> - if (fatal_signal_pending(current))
> + if (fatal_signal_pending(current)) {
> + /*
> + * Swallow the signals we will never deliver.
> + * If we do not do this, the PTRACE_EVENT_EXIT will
> + * be suppressed by those signals.
> + */
> + siginfo_t info;
> +
> + spin_lock_irq(&current->sighand->siglock);
> + while (dequeue_signal(current, &current->blocked, &info));
> + spin_unlock_irq(&current->sighand->siglock);
> +
> do_exit(SIGSYS);
> + }
> /* Check if the tracer forced the syscall to be skipped. */
> this_syscall = syscall_get_nr(current, task_pt_regs(current));
> if (this_syscall < 0)
> --
> 2.7.4
>



--
Kees Cook
Nexus Security