Re: [PATCHSET RFC] ptrace,signal: clean transition between STOPPED and TRACED

From: Jan Kratochvil
Date: Wed Jan 12 2011 - 16:44:13 EST


On Fri, 24 Dec 2010 15:00:50 +0100, Tejun Heo wrote:
> 1. When attaching to a STOPPED task or a traced task stops for group
> stop, the tracee now enters TRACED instead of STOPPED. This is
> visible via fs/proc but, more importantly, SIGCONT is ignored if a
> task is TRACED.
>
> The behavior before the change was quite erratic. The first ptrace
> operation after the tracee enters STOPPED would silently transit
> its state to TRACED behind its back bypassing arch_ptrace_stop().
> This means that SIGCONT is honored until the first following ptrace
> operation but ignored after that.
>
> This may, for example, affect the operation of strace but given how
> strace always need to issue further ptrace operations on trap to
> determine what's going on, I doubt it would actually be worse.

For `T (stopped)' processes, FSF GDB currently does:
PTRACE_ATTACH
check /proc/%d/status for `T (stopped)' (by GDB's pid_is_stopped)
if found, then kill (PID, SIGSTOP) && ptrace (PTRACE_CONT, PID, 0, 0)
waitpid (pid, &status, 0) - the previous step is there so that this one does
not get stuck if the stop event was already consumed earlier.

If the `T (stopped)' check will now always fail, then at least the waitpid
should never get stuck.
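
To make the discussion concrete, the attach sequence described above can be
sketched roughly as follows. This is not GDB's actual code: the function names
status_text_is_stopped, pid_is_stopped_sketch and attach_sketch are made up for
illustration, error handling is minimal, and the "kill && ptrace" step is read
here as "do both".

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Return 1 if the text of a /proc/PID/status file reports `T (stopped)',
 * 0 otherwise.  A "(tracing stop)" state deliberately does not match. */
int status_text_is_stopped(const char *status_text)
{
    assert(status_text != NULL);
    const char *line = status_text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, "State:", 6) == 0) {
            const char *p = line + 6;
            while (*p == ' ' || *p == '\t')
                p++;
            return strncmp(p, "T (stopped)", 11) == 0;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;                 /* step past the newline */
    }
    return 0;                       /* no State: line found */
}

/* Hypothetical stand-in for GDB's pid_is_stopped. */
int pid_is_stopped_sketch(pid_t pid)
{
    char path[64], buf[4096];
    size_t n;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%d/status", (int)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return 0;
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return status_text_is_stopped(buf);
}

/* The sequence from the list above.  Returns 0 if the tracee was seen
 * stopped by waitpid, -1 on any error. */
int attach_sketch(pid_t pid)
{
    int status;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (pid_is_stopped_sketch(pid)) {
        /* Generate a fresh stop event for the waitpid below to collect. */
        if (kill(pid, SIGSTOP) == -1)
            return -1;
        if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
            return -1;
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSTOPPED(status) ? 0 : -1;
}
```

If the tracee shows up as TRACED rather than `T (stopped)' right after
PTRACE_ATTACH, the branch in attach_sketch is simply never taken.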


> 2. The transition between STOPPED and TRACED involves a short window
> of RUNNING inbetween. On attach, the transition is hidden from the
> tracer using GROUP_STOP_TRAPPING but it still is visible to other
> threads in the tracer's group. IOW, if another thread performs
> WNOHANG wait(2) on the tracee while attach is in progress, the
> wait(2) may fail even if the tracee is known to be in stopped state
> before.
>
> The same problem exists the other direction during detach.
> Currently, the code doesn't try to hide this transition even from
> the tracer. IOW, if the tracer attaches to a stopped task,
> detaches, reattaches and then performs WNOHANG wait(2), the wait(2)
> may fail. However, given the previous behavior where the tracee is
> always woken up by wake_up_process() on detach, this is highly
> unlikely to cause any problem.

FSF gdbserver --multi does PTRACE_ATTACH followed by waitpid (WNOHANG), and
gdbserver fails if that waitpid returns ECHILD on the first try.

ptrace(PTRACE_ATTACH, 22049, 0, 0) = 0
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], WNOHANG, NULL) = 22049

It may also be a gdbserver bug, though.


Thanks,
Jan