ptrace(PTRACE_ATTACH) [no intervening wait] ptrace(PTRACE_DETACH) may leave tracee stuck

From: Mike Galbraith
Date: Tue Jul 23 2013 - 06:05:27 EST


I received a report that glibc:elf/pldd hangs occasionally, and indeed..

for i in `seq 1 1000`; do taskset -c 3 pldd $$ > /dev/null 2>&1; done

..will do so. Rummage.....

ptrace(PTRACE_DETACH) returns -ESRCH when the trap hasn't happened yet,
which happens because pldd doesn't wait() before ptrace(PTRACE_DETACH).

pldd source:

      if (ptrace (PTRACE_ATTACH, tid, NULL, NULL) != 0)
        {
          /* There might be a race between reading the directory and
             threads terminating.  Ignore errors attaching to unknown
             threads unless this is the main thread.  */
          if (errno == ESRCH && tid != pid)
            continue;

          error (EXIT_FAILURE, errno, gettext ("cannot attach to process %lu"),
                 tid);
        }

      struct thread_list *newp = alloca (sizeof (*newp));
      newp->tid = tid;
      newp->next = thread_list;
      thread_list = newp;
    }

  closedir (dir);

  int status = get_process_info (dfd, pid);

  assert (thread_list != NULL);
  do
    {
      ptrace (PTRACE_DETACH, thread_list->tid, NULL, NULL);
      thread_list = thread_list->next;
    }
  while (thread_list != NULL);

It seems this usually works only because the cycles expended between
attach and detach are usually enough to let the trap happen, so the
tracee can set its state to TASK_TRACED as PTRACE_DETACH expects.
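
For comparison, here is a minimal sketch of the attach / wait / detach
ordering that does not depend on timing. It is not pldd code:
attach_and_wait() and demo_on_child() are hypothetical helper names made
up for this illustration, and the demo traces a child it forks itself
rather than an arbitrary pid. The only point is that waitpid() reports
the attach stop before PTRACE_DETACH is issued.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: attach to TID and block until its stop has been
   reported.  Returns 0 on success, -1 with errno set on failure.  */
int
attach_and_wait (pid_t tid)
{
  if (ptrace (PTRACE_ATTACH, tid, NULL, NULL) != 0)
    return -1;

  int status;
  pid_t r;
  do
    /* __WALL so that clone()d threads are waited for as well.  */
    r = waitpid (tid, &status, __WALL);
  while (r == -1 && errno == EINTR);

  if (r != tid)
    return -1;
  if (!WIFSTOPPED (status))
    {
      /* The tracee exited instead of stopping.  */
      errno = ESRCH;
      return -1;
    }
  return 0;
}

/* Hypothetical demo: fork a child that just sleeps, attach to it, wait
   for the stop, then detach.  Returns 0 if every step succeeded.  */
int
demo_on_child (void)
{
  pid_t child = fork ();
  if (child == -1)
    return -1;
  if (child == 0)
    for (;;)
      pause ();

  int rc = attach_and_wait (child);
  if (rc == 0 && ptrace (PTRACE_DETACH, child, NULL, NULL) != 0)
    rc = -1;

  /* Clean up the child, preserving errno from the ptrace steps.  */
  int saved = errno;
  kill (child, SIGKILL);
  waitpid (child, NULL, 0);
  errno = saved;
  return rc;
}
```

Calling demo_on_child() from main() should return 0 where ptrace is
permitted. In a sandbox that forbids ptrace, the attach itself fails
(typically with EPERM), so there is nothing to wait for.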

Is this expected behavior? It looks a bit like "Doctor Doctor..".

-Mike
