Re: Problem: shell prompt doesn't return although the invoked program calls _exit().

From: Ishikawa (ishikawa@yk.rim.or.jp)
Date: Thu May 18 2000 - 12:17:17 EST


Ishikawa wrote:

> Hi,
>
> I am writing this message to a few e-mail aliases because I could
> not figure out the cause of the problem.
>
> Does anyone have an idea what causes it?
>
> Observed platform:
> Debian GNU/Linux with kernels 2.2.14, 2.2.15, 2.2.16pre2
>
> Observed problem:
>
> A particular program, called `prog' below, does not return to the
> shell prompt when invoked from the shell command line as follows.
> It never returns.
>
> ./prog -q < inputfile > outputfile 2>&1
>
> (The same program compiled on Solaris 7 for x86 does return to the
> prompt. It may not mean much, but what puzzles me most is that, if
> my memory serves correctly, it did return to the shell on Linux
> back in March and early April.)

Someone wrote to ask whether he could have the source code.
I will check with management to see whether I can make the code available.

Before I forget:
I have been updating my systems with Debian GNU/Linux's apt-get tool.
This tool fetches updated software packages from ftp/http site(s)
and installs them automagically.
This is why I think there has been a subtle change in the last
4-6 weeks that broke this particular program.
(But maybe I am wrong. I also tried an older version of the same
program which, if I am not mistaken, had worked on the earlier
Linux configuration. Again the program hung and the shell prompt
didn't return.)

In the meantime, I suspect that the SIGCHLD generated when the
spawned child finishes (exits) is not delivered correctly to the
invoking process. The shell would have installed a signal handler
for it by the time the child process is exec'ed.
(Again, this probably never happens unless some other system calls
are involved: ioctl against tty ports, a large number of calls to
nanosleep(), time(), read() that would return immediately if there
is no data, etc...)
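
To separate the two suspects (the program itself versus the delivery of
the child's exit to the parent), a stripped-down test may help. What
follows is only a minimal sketch of the fork / _exit() / waitpid()
sequence, under the assumption that the invoking process waits for its
foreground child roughly this way. The name run_child_and_wait is
hypothetical and is not taken from `prog' or from any shell.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that calls _exit(code) at once,
 * then wait for it in the parent. Returns the child's exit status,
 * or -1 if anything goes wrong. */
int run_child_and_wait(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);        /* child: terminate immediately */

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);  /* retry if a signal interrupts the wait */

    if (r != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If a call such as run_child_and_wait(7) returns promptly on the affected
systems, the basic exit notification works, and the hang more likely
depends on what `prog' does before it calls _exit().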

I have noticed that the semantics of SIGCHLD delivery differ in
subtle ways between operating systems.
(The quote below is from the OpenSSH mailing list. OpenSSH is
a project to develop an open/free source replacement for the
commercial SSH (Secure Shell), which is distributed under a
restrictive license.) Have there been changes lately in glibc
or other places that alter the signal handling mechanism?
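
One difference that can be checked directly is whether a handler
installed with signal() stays installed after a delivery (BSD-style)
or is reset to the default (System V-style). The probe below is a
sketch, not a diagnosis: the names probe_handler and
handler_persists_after_delivery are made up for illustration, and it
uses SIGUSR1 rather than SIGCHLD so that no child process is needed.
Note that it leaves SIGUSR1 at its default disposition afterwards.

```c
#include <signal.h>

static volatile sig_atomic_t deliveries = 0;

static void probe_handler(int sig)
{
    (void)sig;
    deliveries++;
}

/* Returns 1 if the handler is still installed after one delivery,
 * 0 if the system reset it to SIG_DFL, and -1 on error. */
int handler_persists_after_delivery(void)
{
    void (*prev)(int);

    deliveries = 0;
    if (signal(SIGUSR1, probe_handler) == SIG_ERR)
        return -1;
    if (raise(SIGUSR1) != 0)    /* handler runs before raise() returns */
        return -1;

    /* signal() returns the previous disposition, so restoring the
     * default also reveals what was installed after the delivery. */
    prev = signal(SIGUSR1, SIG_DFL);
    if (prev == SIG_ERR || deliveries != 1)
        return -1;
    return prev == probe_handler;
}
```

The result may depend on the C library and on the feature-test macros
in effect at compile time, so it is worth running with the same
compiler flags that were used to build the program in question.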

From the OpenSSH mailing list:

> I had the same problem with AIX. It seems that the SIGCHLD handler
> immediately re-calls itself unless the dead process is first reaped.
>
> This fixes it for me:
--- serverloop.c.orig Wed May 10 14:34:00 2000
+++ serverloop.c Thu May 11 08:17:17 2000
@@ -85,7 +85,6 @@
        int save_errno = errno;
        debug("Received SIGCHLD.");
        child_terminated = 1;
-       signal(SIGCHLD, sigchld_handler2);
        errno = save_errno;
 }

@@ -640,6 +639,7 @@
                        while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
                                session_close_by_pid(pid, status);
                        child_terminated = 0;
+                       signal(SIGCHLD, sigchld_handler2);
                }
                channel_after_select(&readset, &writeset);
                process_input(&readset);

Could it be that SIGCHLD handling is different on Linux
(perhaps with the latest glibc)?
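
For comparison, here is a minimal sketch of the same idea as the patch
(the handler only sets a flag, and the main loop reaps the children),
written with sigaction() instead of signal(). With sigaction() and no
SA_RESETHAND flag the handler stays installed, so no re-arming is
needed. This is not the OpenSSH code: child_terminated is borrowed
from the patch, while install_sigchld_handler and reap_children are
hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX declarations (sigaction) */
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t child_terminated = 0;

/* The handler only records that some child changed state. */
static void sigchld_handler(int sig)
{
    (void)sig;
    child_terminated = 1;
}

/* Install the handler once; returns 0 on success, -1 on error. */
int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;  /* restart syscalls, ignore stops */
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Called from the main loop: reap every exited child without blocking.
 * Returns the number of children reaped. */
int reap_children(void)
{
    int status;
    int reaped = 0;

    if (!child_terminated)
        return 0;
    child_terminated = 0;   /* clear first so a later SIGCHLD is not lost */
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```

A main loop would call install_sigchld_handler() once at startup and
reap_children() on each iteration, much as serverloop.c does with
waitpid() in the second hunk above.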

ishikawa



