Possible bug introduced in commit 9b84cca

From: Denys Vlasenko
Date: Wed Dec 28 2011 - 13:56:09 EST


Hi Tejun, Oleg,

Apologies if you have already been informed about this bug
by the people who originally discovered it.

It looks like after commit 9b84cca, waitpid under strace
sometimes returns a bogus ECHILD even though the child does exist.

I have not yet confirmed that the bug appeared exactly
at this commit; Łukasz reports that it did.

I confirmed that the bug exists on kernels 3.1.6 (in Fedora)
and 3.1.0-rc4 (vanilla).

We have a testcase which spawns N threads, each of which
performs an infinite loop of "fork, exit in child, waitpid
in parent for the child". When straced, waitpid sometimes
returns ECHILD. In fact, there is no need to run many threads -
I just saw it happen with a single thread on a 4-CPU machine
when I ran "strace -otestcase1.LOG -f ./testcase1 1".
This machine runs 3.1.0-rc4.

Please find testcase attached.

Also please find testcase1.LOG attached.
The key part is here:

931 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xf763dbd8) = 1048
1048 exit_group(42) = ?
931 waitpid(1048, <unfinished ...>
1048 +++ exited with 42 +++
931 <... waitpid resumed> 0xf763d3a0, 0) = -1 ECHILD (No child processes)
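
To double-check the "child does exist" part from inside the testcase, a
wrapper along the following lines could be used in place of the bare
waitpid call. This is only a rough sketch: the names pid_exists and
waitpid_checked are made up for illustration and are not part of the
attached testcase, and probing with kill(pid, 0) is an assumption about
what is a useful check here (it reports success for zombies too, and it
cannot rule out pid reuse).

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: 1 if a process with this pid exists (running or
 * zombie), 0 if it does not, -1 if kill() failed for another reason. */
static int pid_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    if (errno == EPERM)     /* exists, but we may not signal it */
        return 1;
    return (errno == ESRCH) ? 0 : -1;
}

/* Hypothetical wrapper: waitpid() that retries on EINTR and, when it
 * fails with ECHILD, logs whether the pid still appears to exist. */
static pid_t waitpid_checked(pid_t pid, int *status)
{
    pid_t r;

    do {
        r = waitpid(pid, status, 0);
    } while (r == -1 && errno == EINTR);

    if (r == -1 && errno == ECHILD) {
        int exists = pid_exists(pid);   /* may clobber errno */

        fprintf(stderr, "waitpid(%d): ECHILD, pid %s\n", (int)pid,
                exists == 1 ? "still exists" :
                exists == 0 ? "is gone" : "state unknown");
        errno = ECHILD;                 /* restore for the caller */
    }
    return r;
}
```

In worker(), "waitpid(p, &stat_loc, 0)" would become
"waitpid_checked(p, &stat_loc)"; the existing perror/_exit handling can
stay as it is.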

To complicate matters, this is observed only under the development
version of strace. Old (released) versions of strace do not
let ptraced processes die - they detach from them when
they think they are going to die (such as when they enter _exit()
or receive a "deadly" signal). This is an aesthetically horrible and
logically buggy (racy) hack, so we are removing it from strace.
Łukasz says that old strace versions (the ones which still use the hack)
don't trigger the bug.

For testing, I will send you the strace source tree and a pre-compiled
strace binary in a separate email. Alternatively, pull the latest
strace git and build it with "autoreconf -fvi && ./configure && make".

--
vda
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/sysinfo.h>
#include <sys/wait.h>
#include <stdio.h>
#include <pthread.h>

void* worker(void *arg)
{
    while (1) {
        pid_t p = fork();
        if (-1 == p) { /* error */
            perror("fork");
            _exit(EXIT_FAILURE);
        }
        if (0 == p) { /* child */
            _exit(42);
        }
        /* parent */
        int stat_loc;
        int s = waitpid(p, &stat_loc, 0);
        if (-1 == s) {
            perror("waitpid");
            _exit(EXIT_FAILURE);
        }
    }
}

int main(int argc, char **argv)
{
    int pool_size = get_nprocs() * 4;

    if (argv[1])
        pool_size = atoi(argv[1]);
    printf("Poolsize: %d\n", pool_size);

    pthread_t thread_id;
    int i;
    for (i = 0; i != pool_size; ++i) {
        if (pthread_create(&thread_id, NULL, worker, NULL) != 0) {
            perror("pthread_create");
            _exit(EXIT_FAILURE);
        }
    }

    /* Prevent exiting: wait for last thread (forever) */
    void *retval;
    pthread_join(thread_id, &retval);

    return 43;
}

Attachment: testcase1.LOG.bz2
Description: BZip2 compressed data