Re: A peculiarity in ptrace/waitpid behavior

From: Oleg Nesterov
Date: Sun Mar 22 2015 - 11:35:50 EST


On 03/21, Oleg Nesterov wrote:
>
> On 03/20, Pavel Labath wrote:
> >
> > One difference I see though is that in
> > our test, we are not sending any additional signals to the thread in
> > question (at least we shouldn't be sending them, but we are sending some
> > signals to other threads in the same process). Do you think it could still
> > be the same issue?
>
> Not sure...
>
> And. I found another race, which looks more promising wrt your description.
> ptrace_resume() sets ->exit_code before it wakes the tracee up. If the
> tracer's sub-thread calls wait() right after that, it can wrongly see
> task_stopped_code(tracee, true) != 0, as if the tracee reports its
> ->exit_code.
>
> > I would be happy to test your patch. I don't think I can patch the kernel
> > on my work machine directly, but I think I might be able to set up some
> > sort of a test environment to try it out.
>
> Thanks! could you try the patch below? It won't help my test-case, but
> _perhaps_ it can fix the problem you hit?
>
> And a couple of questions just in case...
>
> Which kernel version? Although probably this doesn't matter, this race
> is very-very old.
>
> Let me return to your description,
>
> 1) we get a waitpid() notification that the tracee got SIGUSR1
> 2) we do a ptrace(GETSIGINFO) to get more info
> 3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
> SIGUSR1
> 4) immediately after that we get another waitpid notification, again with
> SIGUSR1,
>
> Does this "waitpid notification" mean that _another_ thread returns
> from waitpid() ?
>
> And status == (SIGUSR1 << 8) | 0x7f , yes? IOW, is WIFSTOPPED() true?


Please see the updated (comments + optimization) patch below. Note that
it has the test-case, so I'll send it in any case.

But it would be nice to know if it fixes your problem or not.

Josh, this reminds me the obsure bug report we discussed some time ago,
perhaps this is the same thing...

Oleg.

------------------------------------------------------------------------------
Subject: [PATCH] ptrace: fix race between ptrace_resume() and wait_task_stopped()

ptrace_resume() is called when the tracee is still __TASK_TRACED. We set
tracee->exit_code and then wake_up_state() changes tracee->state. If the
tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T)
wrongly looks like another report from tracee.

This confuses debugger, and since wait_task_stopped() clears ->exit_code
the tracee can miss a signal.

Test-case:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <pthread.h>
#include <assert.h>

int pid;

void *waiter(void *arg)
{
int stat;

for (;;) {
assert(pid == wait(&stat));
assert(WIFSTOPPED(stat));
if (WSTOPSIG(stat) == SIGHUP)
continue;

assert(WSTOPSIG(stat) == SIGCONT);
printf("ERR! extra/wrong report:%x\n", stat);
}
}

int main(void)
{
pthread_t thread;

pid = fork();
if (!pid) {
assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
for (;;)
kill(getpid(), SIGHUP);
}

assert(pthread_create(&thread, NULL, waiter, NULL) == 0);

for (;;)
ptrace(PTRACE_CONT, pid, 0, SIGCONT);

return 0;
}

Note for stable: the bug is very old, but without 9899d11f6544 "ptrace:
ensure arch_ptrace/ptrace_request can never race with SIGKILL" the fix
should use lock_task_sighand(child).

Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
Reported-by: Pavel Labath <labath@xxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
---
kernel/ptrace.c | 20 ++++++++++++++++++++
1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 1eb9d90..5009263 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -697,6 +697,8 @@ static int ptrace_peek_siginfo(struct task_struct *child,
static int ptrace_resume(struct task_struct *child, long request,
unsigned long data)
{
+ bool need_siglock;
+
if (!valid_signal(data))
return -EIO;

@@ -724,8 +726,26 @@ static int ptrace_resume(struct task_struct *child, long request,
user_disable_single_step(child);
}

+ /*
+ * Change ->exit_code and ->state under siglock to avoid the race
+ * with wait_task_stopped() in between; a non-zero ->exit_code will
+ * wrongly look like another report from tracee.
+ *
+ * Note that we need siglock even if ->exit_code == data and/or this
+ * status was not reported yet, the new status must not be cleared by
+ * wait_task_stopped() after resume.
+ *
+ * If data == 0 we do not care if wait_task_stopped() reports the old
+ * status and clears the code too; this can't race with the tracee, it
+ * takes siglock after resume.
+ */
+ need_siglock = data && !thread_group_empty(current);
+ if (need_siglock)
+ spin_lock_irq(&child->sighand->siglock);
child->exit_code = data;
wake_up_state(child, __TASK_TRACED);
+ if (need_siglock)
+ spin_unlock_irq(&child->sighand->siglock);

return 0;
}
--
1.5.5.1


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/