Re: [PATCH] seccomp: Fix tracer exit notifications during fatal signals

From: Kees Cook
Date: Thu Aug 11 2016 - 14:06:22 EST


On Thu, Aug 11, 2016 at 12:27 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Wed, Aug 10, 2016 at 4:37 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> This fixes a ptrace vs fatal pending signals bug as manifested in seccomp
>> now that ptrace was reordered to happen after ptrace. The short version is
>> that seccomp should not attempt to call do_exit() while fatal signals are
>> pending under a tracer. This was needlessly paranoid. Instead, the syscall
>> can just be skipped and normal signal handling, tracer notification, and
>> process death can happen.
>>
>> Slightly edited original bug report:
>>
>> If a tracee task is in a PTRACE_EVENT_SECCOMP trap, or has been resumed
>> after such a trap but not yet been scheduled, and another task in the
>> thread-group calls exit_group(), then the tracee task exits without the
>> ptracer receiving a PTRACE_EVENT_EXIT notification. Test case here:
>> https://gist.github.com/khuey/3c43ac247c72cef8c956ca73281c9be7
>>
>> The bug happens because when __seccomp_filter() detects
>> fatal_signal_pending(), it calls do_exit() without dequeuing the fatal
>> signal. When do_exit() sends the PTRACE_EVENT_EXIT notification and
>> that task is descheduled, __schedule() notices that there is a fatal
>> signal pending and changes its state from TASK_TRACED to TASK_RUNNING.
>> That prevents the ptracer's waitpid() from returning the ptrace event.
>> A more detailed analysis is here:
>> https://github.com/mozilla/rr/issues/1762#issuecomment-237396255.
>>
>> Reported-by: Robert O'Callahan <robert@xxxxxxxxxxxxx>
>> Reported-by: Kyle Huey <khuey@xxxxxxxxxxxx>
>> Fixes: 93e35efb8de4 ("x86/ptrace: run seccomp after ptrace")
>> Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
>> ---
>> kernel/seccomp.c | 12 ++++++++----
>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index ef6c6c3f9d8a..0db7c8a2afe2 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -605,12 +605,16 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>> ptrace_event(PTRACE_EVENT_SECCOMP, data);
>> /*
>> * The delivery of a fatal signal during event
>> - * notification may silently skip tracer notification.
>> - * Terminating the task now avoids executing a system
>> - * call that may not be intended.
>> + * notification may silently skip tracer notification,
>> + * which could leave us with a potentially unmodified
>> + * syscall that the tracer would have liked to have
>> + * changed. Since the process is about to die, we just
>> + * force the syscall to be skipped and let the signal
>> + * kill the process and correctly handle any tracer exit
>> + * notifications.
>> */
>
> What does "The delivery of a fatal signal during event notification
> may silently skip tracer notification" mean? Is that describing
> exactly the issue you're fixing? If so, maybe that sentence should be
> deleted.

Well, it's related. The fatal signal delivery may skip notifications,
but we could deterministically get into the case of skipping
PTRACE_EVENT_EXIT due to the do_exit(), which is much more confusing.
So changing this restores the pre-existing (hard to hit) race.

> Otherwise looks good to me.

Thanks!

-Kees

--
Kees Cook
Nexus Security