Re: [PATCHv5] exec: Fix a deadlock in ptrace

From: Christian Brauner
Date: Tue Mar 03 2020 - 11:51:22 EST


On Tue, Mar 03, 2020 at 09:18:44AM -0600, Eric W. Biederman wrote:
> Bernd Edlinger <bernd.edlinger@xxxxxxxxxx> writes:
>
> > This fixes a deadlock in the tracer when tracing a multi-threaded
> > application that calls execve while more than one thread is running.
> >
> > I observed that when running strace on the gcc test suite, it always
> > blocks after a while, when expect calls execve, because other threads
> > have to be terminated. They send ptrace events, but the strace is no
> > longer able to respond, since it is blocked in vm_access.
> >
> > The deadlock is always happening when strace needs to access the
> > tracee's process mmap, while another thread in the tracee starts to
> > execve a child process, but that cannot continue until the
> > PTRACE_EVENT_EXIT is handled and the WIFEXITED event is received:
>
> A couple of things.
>
> Why do we think it is safe to change the behavior exposed to userspace?
> Not the deadlock but all of the times the current code would not
> deadlock?
>
> Especially given that this is a small window it might be hard for people
> to track down and report so we need a strong argument that this won't
> break existing userspace before we just change things.
>
> Usually surveying all of the users of a system call that we can find
> and checking to see if they might be affected by the change in behavior
> is difficult enough that we usually opt for not being lazy and
> preserving the behavior.
>
> This patch is up to two changes in behavior now, that could potentially
> affect a whole array of programs. Adding linux-api so that this change
> in behavior can be documented if/when this change goes through.
>
> If you can split the documentation and test fixes out into separate
> patches that would help reviewing this code, or please make it explicit
> that you are changing documentation about behavior that is changing
> with this patch.

Agreed. I think it'd be good to do it in three patches:
1. unrelated documentation update
2. fix + documentation changes specific to the fix
3. test(s)

Christian