Re: Network socket disconnect unreliable

From: Richard B. Johnson (root@chaos.analogic.com)
Date: Tue Feb 22 2000 - 09:19:14 EST


On Sun, 20 Feb 2000, Alan Cox wrote:

> > Yep. Wonderful. When the socket disconnects, select doesn't return!
>
> It does here. Do you have the right fd in your read set ?
>
> > read/write code does as you show, but without the exception mask
> > select() doesn't return after a disconnect. At least not on
> > the 'current' versions of Linux (up to 2.3.41).
>
> The exception mask shouldnt even be applied here. It is for reporting urgent
> data packets.
>

If you telnet to the server, then hang up, everything is fine. I tried
to find out how many 'hits' per second the server could take. I just
connected then closed the socket in a loop. The result was connections
'forever' so it may be some kind of race. The original code, before
I added the exception-flags hack follows. During the fast connect/
disconnect select() does not return so the server sleeps forever.
If I add the exception hack, select does return, however the appropriate
rfds bit is not set so the procedure loops forever, eating CPU time.

The procedure write_all() writes all the data to the fd, returning
zero if everything went okay. It uses write() in a loop, indexing
through the data.

/* -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= */
/*
 * Run the RX/TX binary state-machine until there is a disconnection
 * or any other permanent error.
 */
                sfd = MAX(fd, ma) +1;
                for(;;)
                {
                    FD_ZERO(&mem->rfds);
                    FD_SET(fd, &mem->rfds);
                    FD_SET(ma, &mem->rfds);
                    if(select(sfd, &mem->rfds, NULL, NULL, NULL) <= 0)
                        break;
                    if(FD_ISSET(fd, &mem->rfds))
                    {
                        if((i = read(fd, mem->buf, BUF_LEN)) <= 0)
                            break;
                        if(write_all(ma, mem->buf, i))
                            break;
                    }
                    if(FD_ISSET(ma, &mem->rfds))
                    {
                        if((i = read(ma, mem->buf, BUF_LEN)) <= 0)
                             break;
                        pty_bug = mem->buf;
                        if((i > 1) && (*pty_bug == '\0'))
                        {
                            i--;
                            pty_bug++;
                        }
                        if(write_all(fd, pty_bug, i))
                             break;
                    }
                }
                (void)close(ma);
                (void)close(fd);
                exit(EXIT_SUCCESS);

The code contains a hack as a work-around to the pty bug that was
introduced when Linux changed the ptys. Data read from the master
pty contains a leading ASCII null-byte, even if the user of the
slave pty sets the terminal to raw mode. You don't see the leading
zero on most terminals but it certainly messes up a binary file
transfer using `rz` and/or `sz` with telnet. I really wish this
was fixed also. This was reported just after the addition of the
Unix-98 ptys.

Cheers,
Dick Johnson

Penguin : Linux version 2.3.41 on an i686 machine (800.63 BogoMips).

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Feb 23 2000 - 21:00:30 EST