2.1.x -- premature SIGIO for socket write()'s?

Arup Mukherjee (arup+@cmu.edu)
Fri, 7 Aug 1998 11:45:27 -0400


--LDbcBtlwCo
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Summary:

Attached is a small Java program showing that large writes to
sockets often block forever. The problem appears with multiple
independent JDK ports (1.1.3-1.1.6) under at least kernels 2.1.65
and 2.1.114, but not under 2.0.x kernels. A larger C program, also
attached, demonstrates what we believe is the cause of the problem
under Java, which seems to be a kernel bug in 2.1.114.

More Details:

An strace of the JDK running the Java code shows writes to
the socket succeeding until the write system call fails with
EAGAIN. At that point, we believe that the Java runtime waits for a
SIGIO (which is delivered). Upon receipt of the SIGIO, the runtime
calls select() to find out which descriptor is now ready for
writing. It looks like the select call returns without showing that
the socket is once again ready for writing, so the runtime never
retries the write.
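
For readers who want to see the EAGAIN / select() relationship in isolation, here is a minimal sketch. It is not part of the attached test programs. It assumes a non-blocking stream socket; the helper names writable_now and fill_until_eagain are made up for illustration. The expectation on a correctly behaving kernel is that select() reports the socket as not writable right after write() fails with EAGAIN, and as writable again once the peer has drained the data.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: zero-timeout select().
   Returns 1 if fd is writable right now, 0 if not, -1 on error. */
int writable_now(int fd)
{
    fd_set wfds;
    struct timeval tv = { 0, 0 };

    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    if (select(fd + 1, NULL, &wfds, NULL, &tv) < 0)
        return -1;
    return FD_ISSET(fd, &wfds) ? 1 : 0;
}

/* Hypothetical helper: write to a non-blocking fd until write() fails.
   Returns the errno of the failing call (EAGAIN once the send buffer
   is full). Interrupted writes are retried. */
int fill_until_eagain(int fd)
{
    char buf[4096] = { 0 };

    for (;;) {
        if (write(fd, buf, sizeof buf) < 0) {
            if (errno == EINTR)
                continue;
            return errno;
        }
    }
}
```

A caller would set O_NONBLOCK on the socket, call fill_until_eagain(), and then compare writable_now() before and after the peer reads the pending data.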

Although we aren't certain that this is the cause of the
problem under Java, we have established that it is possible, under
2.1.114, for SIGIO to be delivered before select will show the
file descriptor ready for writing. The attached C program
demonstrates this. We don't see any such behavior under 2.0
kernels. Finally, if the select call is delayed sufficiently (we're
not sure how long, but sometimes it takes more than half a second),
it WILL eventually show the descriptor ready for writing. We're
guessing this behavior is unintentional. Could someone confirm
that, and hopefully put it on the fix list for 2.2?

Since neither of us is on the linux kernel mailing list,
please CC us personally on any replies.

Thanks,

-Arup Mukherjee (arup@cmu.edu) and Darrell Kindred (dkindred@cmu.edu)

--LDbcBtlwCo
Content-Type: text/plain
Content-Description: WriteTest.java
Content-Disposition: inline;
filename="WriteTest.java"
Content-Transfer-Encoding: 7bit

import java.net.*;
import java.io.*;

public class WriteTest {

    static final int bufsz = 1024 * 1024;

    public static void main (String args[])
    {
        Socket s;

        byte buf[] = new byte[bufsz];

        for (int i = 0; i < bufsz; ++i)
            buf[i] = (byte) (i % 256);

        try {
            // Port 9 is the TCP discard service.
            s = new Socket ("localhost", 9);
            OutputStream os = s.getOutputStream();

            // Print one dot per completed 1 MB write; the dots stop
            // when a write blocks forever.
            while (true) {
                os.write(buf);
                System.out.print (".");
                System.out.flush();
            }

        } catch (Exception e) {
            e.printStackTrace(System.err);
        }
    }
}



--LDbcBtlwCo
Content-Type: text/plain
Content-Description: C code to show SIGIO/select problem
Content-Disposition: inline;
filename="dk.c"
Content-Transfer-Encoding: 7bit

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/param.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <netdb.h>
#include <netinet/tcp.h>
#include <fcntl.h>
#include <sys/file.h>
#include <errno.h>
#include <unistd.h>
#include <string.h>

#define WRITELEN (32*1024)

/* Incremented by the SIGIO handler. volatile sig_atomic_t keeps access
   from the handler well-defined. errno comes from <errno.h> and must
   not be declared by hand. */
volatile sig_atomic_t got_sigio;

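/* Wait for a SIGIO, then check with select() that the socket really is
   writable (or has an exception pending). Exit if it is not. */
void
wait_for_sigio (int s)
{
    int rc;
    fd_set wfds, xfds;
    struct timeval tv;

    /* Note: a signal that arrives between the test and pause() is
       missed; sigsuspend() would close that window. */
    while (!got_sigio)
        pause();

    /* Zero timeout: report the current state without blocking. */
    tv.tv_sec = 0;
    tv.tv_usec = 0;

    FD_ZERO (&wfds);
    FD_SET (s, &wfds);
    /* You can see the same problem with a longer timeout if some other
       file descriptor is ready, causing select to return quickly:
       tv.tv_sec = 5;
       FD_SET (fileno(stdout), &wfds);
    */
    FD_ZERO (&xfds);
    FD_SET (s, &xfds);

    rc = select (s+1, NULL, &wfds, &xfds, &tv);
    if (!FD_ISSET (s, &wfds) && !FD_ISSET (s, &xfds))
    {
        fprintf (stderr, "\nWhoa! fd %d not writable after SIGIO (select returned %d)\n",
                 s, rc);
        exit(1);
    }
}

/* SIGIO handler: only records that the signal arrived. */
void
sigio_handler (int sig)
{
    got_sigio++;
}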

/* Connect to the named TCP service on localhost and return a
   non-blocking socket with SIGIO delivery enabled. */
int
make_connection (char *portname)
{
    struct sockaddr_in name;
    struct hostent *hent;
    struct servent *serv;
    int one = 1;
    int s;

    if ((s = socket(PF_INET,SOCK_STREAM,0)) == -1)
    {
        perror("socket");
        exit(1);
    }
    if (setsockopt(s, getprotobyname("tcp")->p_proto,
                   TCP_NODELAY,(char *)&one,sizeof(one)) == -1)
    {
        perror("setsockopt");
        exit(1);
    }

    if ((serv = getservbyname(portname, "tcp")) == NULL)
    {
        perror("getservbyname");
        exit(1);
    }

    memset(&name,0,sizeof(name));
    name.sin_family = AF_INET;
    name.sin_port = serv->s_port;   /* already in network byte order */

    if ((hent = gethostbyname("localhost")) == NULL)
    {
        perror("hent");
        exit(1);
    }
    memcpy(&name.sin_addr, hent->h_addr_list[0], sizeof(name.sin_addr));

    if (connect(s,(struct sockaddr *) &name,sizeof(name)) == -1)
    {
        fprintf(stderr, "connection to localhost:%s failed\n", serv->s_name);
        perror("connect");
        exit(1);
    }

    /* Non-blocking I/O, with SIGIO sent to this process when the
       socket's state changes. */
    if (fcntl(s, F_SETFL, O_RDWR|O_NONBLOCK|FASYNC) < 0)
        perror("fcntl1");
    if (fcntl(s, F_SETOWN, getpid()) < 0)
        perror("fcntl2");

    return s;
}

int
main(int argc, char *argv[])
{
    int sock = make_connection(argc > 1 ? argv[1] : "discard");

    /* install SIGIO handler */
    {
        struct sigaction sa;

        sa.sa_handler = sigio_handler;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGIO,&sa,NULL);
    }

    /* Write forever. Prints "." for each successful write and "B" each
       time write() fails with EAGAIN and we wait for SIGIO. */
    {
        char buf[WRITELEN];
        int cols = 0;

        memset(buf, 0, sizeof(buf));    /* don't send uninitialized data */

        while (1) {
            int rc;
            got_sigio = 0;
            if ((rc = write(sock,buf,WRITELEN)) < 0)
            {
                if (errno == EAGAIN)
                {
                    fprintf(stderr, "B");
                    wait_for_sigio(sock);
                }
                else
                {
                    perror("write");
                    exit(1);
                }
            }
            else
                fprintf(stderr,".");

            if (cols++ % 70 == 69)
                fprintf(stderr,"\n");
        }
    }
}

--LDbcBtlwCo
