EPIPE dude?

Vince Weaver (weave@eng.umd.edu)
Mon, 10 Mar 1997 18:40:26 -0500 (EST)


Hello. I am running Linux 2.1.28 (with the patch that fixes the dropped
characters when telnetting in) on a Cyrix 486 DX2-66 with an NE2000
clone network card and 20 MB of RAM.

I run a talker on my computer (people telnet in and chat) that
is based on NUTS 3.3.2 but I have heavily modified it.

The talker and the machine had both been up for 4 days, and until now
the kernel had been running fine.

Suddenly I started getting sporadic errors saying
"tcp_do_sendmsg1: EPIPE dude..."
at times when someone connected to the talker pressed Enter. (When a
user presses Enter, the string they have typed is sent over a socket
to each person connected.)
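
For illustration, the send step described above could be sketched
roughly as follows. This is not the actual NUTS code: broadcast_line()
and ignore_sigpipe() are made-up names, and the fds array stands in for
whatever client table the talker really keeps. The sketch assumes
blocking sockets. With SIGPIPE ignored, a write() to a connection whose
peer has gone away can fail with errno set to EPIPE (or ECONNRESET),
and the sketch drops that client rather than writing to it again.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ignore SIGPIPE so that writing to a dead
 * connection returns -1 with errno == EPIPE instead of killing us. */
void ignore_sigpipe(void)
{
    signal(SIGPIPE, SIG_IGN);
}

/* Hypothetical helper: send buf[0..len-1] to every descriptor in
 * fds[0..n-1]. Slots holding -1 are skipped. A descriptor whose write
 * fails (EPIPE, ECONNRESET, or any other error except EINTR) is closed
 * and its slot set to -1. Returns the number of descriptors dropped. */
int broadcast_line(int *fds, size_t n, const char *buf, size_t len)
{
    int dropped = 0;

    assert(fds != NULL && buf != NULL);

    for (size_t i = 0; i < n; i++) {
        size_t off = 0;

        if (fds[i] < 0)
            continue;

        while (off < len) {
            ssize_t w = write(fds[i], buf + off, len - off);

            if (w > 0) {
                off += (size_t)w;       /* partial write: keep going */
                continue;
            }
            if (w < 0 && errno == EINTR)
                continue;               /* interrupted: retry */

            /* Peer is gone or the socket is broken: drop this client. */
            close(fds[i]);
            fds[i] = -1;
            dropped++;
            break;
        }
    }
    return dropped;
}
```

A server using this would call ignore_sigpipe() once at startup and
broadcast_line() each time a user finishes a line. A non-blocking
server would also have to treat EAGAIN as "try again later", which this
sketch does not do.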

The error did not appear every time, only about 50% of the time. The
"tcp_do_sendmsg1: EPIPE dude" message comes from line 820 of
/linux/net/ipv4/tcp.c (I found this with creative use of grep).
Here is the relevant part of syslog:

Mar 10 17:20:15 hal kernel: tcp_do_sendmsg1: EPIPE dude...
Mar 10 17:20:25 hal last message repeated 3 times
Mar 10 17:22:01 hal kernel: tcp_do_sendmsg1: EPIPE dude...
Mar 10 17:22:08 hal last message repeated 3 times
Mar 10 17:24:11 hal last message repeated 2 times
Mar 10 17:25:23 hal last message repeated 12 times
Mar 10 17:25:39 hal last message repeated 8 times
Mar 10 17:28:15 hal kernel: tcp_do_sendmsg1: EPIPE dude...
Mar 10 17:29:16 hal last message repeated 32 times
Mar 10 17:30:03 hal last message repeated 9 times
Mar 10 17:38:06 hal kernel: tcp_do_sendmsg1: EPIPE dude...
Mar 10 17:39:04 hal last message repeated 17 times
Mar 10 17:41:48 hal kernel: tcp_do_sendmsg1: EPIPE dude...
Mar 10 17:42:06 hal last message repeated 9 times
Mar 10 18:11:57 hal kernel: tcp_do_sendmsg1: EPIPE dude...
Mar 10 18:13:03 hal last message repeated 4 times

I tried restarting the talker to see if the problem would go away, but
it gave an error saying its ports (7000, 7001, 7002) were still in use.
This had never happened before. netstat displayed the following:

hal:~# netstat
Active Internet connections
Proto Recv-Q Send-Q Local Address           Foreign Address         (State)      User
tcp        1      0 hal.dorm.umd.edu:7001   hal.dorm.umd.edu:13525  TIME_WAIT    root
tcp        0      0 hal.dorm.umd.edu:13651  z.glue.umd.edu:telnet   ESTABLISHED  vince
tcp        1      0 hal.dorm.umd.edu:7000   z.glue.umd.edu:35715    TIME_WAIT    root
Active UNIX domain sockets
Proto RefCnt Flags Type State Path

After another minute or two everything cleared, and I was able to
restart the talker with no errors.
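
A "port still in use" error while old connections on that port sit in
TIME_WAIT is what bind() normally reports (EADDRINUSE) when the
listening socket has not set SO_REUSEADDR. Whether that is what
happened here is only a guess. A minimal sketch of a listener that sets
the option is below; listen_reusable() is a made-up name, not something
from NUTS, and the backlog value is arbitrary.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP socket listening on the given port
 * on all interfaces, with SO_REUSEADDR set so that bind() is not
 * refused because of old connections in TIME_WAIT.
 * Returns the listening descriptor, or -1 on any failure. */
int listen_reusable(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd;

    assert(backlog > 0);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Must be set before bind() to have any effect on it. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0) {
        close(fd);
        return -1;
    }

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

For the talker's ports this would be called as, for example,
listen_reusable(7000, 5), once per port. The option does not let two
live listeners share a port; it only relaxes the check against
leftover connections.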

My question: is this a kernel problem? And if so have I missed a patch?

Thanks for your help.

Vince Weaver
____________
\ /\ /\ / Vince Weaver
\/__\/__\/ weave@eng.umd.edu http://www.glue.umd.edu/~weave