Re: [PATCHv6] exec: Fix a deadlock in ptrace

From: Bernd Edlinger
Date: Thu Mar 05 2020 - 13:37:05 EST


On 3/4/20 10:56 PM, Bernd Edlinger wrote:
> This fixes a deadlock in the tracer when tracing a multi-threaded
> application that calls execve while more than one thread is running.
>
> I observed that when running strace on the gcc test suite, it always
> blocks after a while, when expect calls execve, because other threads
> have to be terminated. They send ptrace events, but strace is no
> longer able to respond, since it is blocked in mm_access.
>
> The deadlock always happens when strace needs to access the
> tracee's process mmap while another thread in the tracee starts to
> execve a child process, but that cannot continue until the
> PTRACE_EVENT_EXIT is handled and the WIFEXITED event is received:
>
> strace D 0 30614 30584 0x00000000
> Call Trace:
> __schedule+0x3ce/0x6e0
> schedule+0x5c/0xd0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock.isra.13+0x1ec/0x520
> __mutex_lock_killable_slowpath+0x13/0x20
> mutex_lock_killable+0x28/0x30
> mm_access+0x27/0xa0
> process_vm_rw_core.isra.3+0xff/0x550
> process_vm_rw+0xdd/0xf0
> __x64_sys_process_vm_readv+0x31/0x40
> do_syscall_64+0x64/0x220
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> expect D 0 31933 30876 0x80004003
> Call Trace:
> __schedule+0x3ce/0x6e0
> schedule+0x5c/0xd0
> flush_old_exec+0xc4/0x770
> load_elf_binary+0x35a/0x16c0
> search_binary_handler+0x97/0x1d0
> __do_execve_file.isra.40+0x5d4/0x8a0
> __x64_sys_execve+0x49/0x60
> do_syscall_64+0x64/0x220
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> The proposed solution is to detect if a sibling thread
> exists that is traced and in this case to make PTRACE_ATTACH
> fail with -EAGAIN instead of deadlocking.
> But other functions like mm_access are allowed to complete normally.
>
> This changes the lifetime of the cred_guard_mutex lock to be
> from flush_old_exec() through install_exec_creds().
> Before, cred_guard_mutex was held from prepare_bprm_creds() through
> install_exec_creds().
>
> Additionally, a new mutex exec_guard_mutex is introduced that is used
> for PTRACE_ATTACH and SECCOMP_FILTER_FLAG_TSYNC.
>
> Signed-off-by: Bernd Edlinger <bernd.edlinger@xxxxxxxxxx>
> ---
> Documentation/security/credentials.rst | 29 ++++++++---
> fs/exec.c | 58 ++++++++++++++++++---
> include/linux/binfmts.h | 15 +++++-
> include/linux/sched/signal.h | 10 ++--
> init/init_task.c | 1 +
> kernel/cred.c | 4 +-
> kernel/fork.c | 1 +
> kernel/ptrace.c | 20 ++++++--
> kernel/seccomp.c | 15 +++---
> mm/process_vm_access.c | 2 +-
> tools/testing/selftests/ptrace/Makefile | 4 +-
> tools/testing/selftests/ptrace/vmaccess.c | 85 +++++++++++++++++++++++++++++++
> 12 files changed, 210 insertions(+), 34 deletions(-)
> create mode 100644 tools/testing/selftests/ptrace/vmaccess.c
>
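As a rough illustration of the "fail with -EAGAIN instead of deadlocking" idea
described above, here is a small userspace model. It is only a sketch and not
code from the patch: struct model_proc, the exec_waits_for_traced_sibling flag
and all model_* functions are hypothetical names, and a pthread mutex stands in
for the kernel's exec_guard_mutex. The point it shows is that the attach side
only inspects state under a briefly held lock and returns -EAGAIN, so the
tracer stays free to handle pending ptrace events before it tries again.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for per-process state; NOT the kernel's data layout. */
struct model_proc {
	pthread_mutex_t exec_guard_mutex;   /* name taken from the patch text */
	bool exec_waits_for_traced_sibling; /* hypothetical flag */
};

void model_proc_init(struct model_proc *p)
{
	pthread_mutex_init(&p->exec_guard_mutex, NULL);
	p->exec_waits_for_traced_sibling = false;
}

/* Exec side: note that execve is about to wait for a traced sibling. */
void model_exec_begin(struct model_proc *p, bool has_traced_sibling)
{
	pthread_mutex_lock(&p->exec_guard_mutex);
	p->exec_waits_for_traced_sibling = has_traced_sibling;
	pthread_mutex_unlock(&p->exec_guard_mutex);
}

/* Exec side: the siblings are gone, so attaching is possible again. */
void model_exec_end(struct model_proc *p)
{
	pthread_mutex_lock(&p->exec_guard_mutex);
	p->exec_waits_for_traced_sibling = false;
	pthread_mutex_unlock(&p->exec_guard_mutex);
}

/* Attach side: returns 0 or -EAGAIN; never sleeps until exec completes. */
int model_attach(struct model_proc *p)
{
	int ret;

	pthread_mutex_lock(&p->exec_guard_mutex);
	ret = p->exec_waits_for_traced_sibling ? -EAGAIN : 0;
	pthread_mutex_unlock(&p->exec_guard_mutex);
	return ret;
}

/*
 * Tracer side: bounded retry (a hypothetical policy). handle_events models
 * the tracer processing pending ptrace events between attempts.
 */
int model_attach_retry(struct model_proc *p, int max_tries,
		       void (*handle_events)(struct model_proc *))
{
	int ret = -EAGAIN;

	for (int i = 0; i < max_tries && ret == -EAGAIN; i++) {
		ret = model_attach(p);
		if (ret == -EAGAIN && handle_events)
			handle_events(p);
	}
	return ret;
}
```

In this model, an attach attempt made while the exec is waiting returns
-EAGAIN right away, and a retry succeeds once the event handler has let the
exec side finish (modelled here by passing model_exec_end as handle_events).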

Okay, I think there is consensus that the next steps are as follows:

- post the Documentation/security/credentials.rst changes as an independent patch.
- post an infrastructure patch which only introduces two new mutexes:
  exec_guard_mutex and "cred_change_mutex". (I am unhappy with the latter name,
  because credentials can change without the cred_guard_mutex; this appears more
  to guarantee that the credentials of the process and the process memory map are
  consistent, so I think I need to think of a better name first...)
  This keeps cred_guard_mutex as is, just deprecates it, and adds a note that it
  will go away.
- post one patch that fixes the mm_access code path
- post one patch that fixes the PTRACE_ATTACH code path
- post one patch that introduces the new test cases


Thanks
Bernd.