Re: 2.6.14-rc1 wait()/SIG_CHILD bevahiour

From: Kyle Moffett
Date: Mon Sep 19 2005 - 13:01:58 EST


On Sep 19, 2005, at 13:39:33, Badari Pulavarty wrote:
Hi,

I am looking at a problem where the parent process doesn't seem to cleanup the exited children (with a webserver). We narrowed it down to a simple testcase. Seems more like a lost SIG_CHILD.

You don't get one SIG_CHLD per child that quits. The kernel may and probably will merge SIG_CHLD signals together if it has several queued before it gets a chance to deliver them to your process. This is true of _all_ signals. If you "kill -STOP 1234", then "kill -QUIT 1234", "kill -QUIT 1234", "kill -QUIT 1234", "kill -CONT 1234", the PID 1234 will have 3 signals delivered: The original untrappable SIGSTOP, the SIGCONT that causes it to resume, and a single SIGQUIT immediately following it. The correct and portable way to handle this is to put a loop in your SIGCHLD signal handler:

#include <sys/types.h>
#include <sys/wait.h>

void sigchld_handler(int signal) {
pid_t pid;
int status;

while( -1 != (child = waitpid(-1, &status, WNOHANG)) ) {
/*
* Now "status" is the exit status of the child and
* "pid" is its pid. See the waitpid() manpage for
* macros you can use to get information from the
* status variable.
*/
do_some_processing_of_exited_child(pid,status);
}
}

Cheers,
Kyle Moffett

--
There is no way to make Linux robust with unreliable memory subsystems, sorry. It would be like trying to make a human more robust with an unreliable O2 supply. Memory just has to work.
-- Andi Kleen


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/