Re: timebomb

E.J. Wilburn (ej@ns1.woodtech.com)
Sun, 23 Jul 1995 03:18:09 -0500 (CDT)


On Sat, 22 Jul 1995, Marcus Nilsson wrote:

> We are an Internet Provider that began with Linux 0.99.8, and we have had
> many crashes since then. However we have stuck with Linux anyway, and
> 1.2.x was a huge improvement in stability. Now I read in a comp.linux.
> group, that prior to 1.2.9, there was a race condition which made linux
> crash when there was high network and serial activity. We are running
> 1.2.10 at the moment, and linux still crashes at least 2 times a week. No
> kernel panics, just a freeze. The only thing you can do is switch
> consoles, but you cannot type anything.
>
> Now, we have a terminal server rlogin'ing in to our box, so we don't have
> any serial activity whatsoever. We have one separate Linux machine
> acting as a NFS-server, to protect our file systems. The only thing it
> runs is nfs-server, and it never crashes. We have 32 lines in, and at
> evenings we have over 20 users logged in. Some of them are running
> SLIP(not TIA, Linux's native SLIP).
>
> In this case, we naturally have a lot of network activity. Before we had
> NE2000 cards, but we have switched to 3c509-cards(NE2000 crashed many
> times before). If there is a race condition in the 3Com cards, we'd be
> more than happy to switch to a card that works.
>
> As a programmer myself, I know that this kind of problem is the worst.
> "It hangs sometimes". But you did fix something in 1.2.9, as stated
> above, so perhaps you should take a look at it again. Again, we have no
> serial activity, but a heavy network load (I even linked /etc/passwd
> and /etc/shadow over NFS; NIS didn't work, it just crashed).
>
> Until then, for us, Linux is still a ticking timebomb.
>
> /Marcus Nilsson, Kuai Scandinavia AB.

We're having the EXACT same problem here running 1.2.11 on an AMD DX4/100
with 16 MB non-parity RAM, a 1.2 GB EIDE drive, and a Shuttle motherboard
with AMI BIOS. It has a 3c509 network card (TP) and runs ELF and bash
1.14.4. Something we've noticed that you might check for: networking and
everything else already running in the kernel are unaffected. We can ping
the locked system, telnet to it, and FTP to it. When we telnet or FTP to
it, it opens the port but doesn't run the daemons, so we never get a full
FTP connection and never get a login prompt. We can also switch VTs at the
console but aren't able to type anything. We have the same thing
happening on another system with the same basic configuration, except it's
got a Cyclades cyclom 32ye running pppd 2.2b3 and dip 3.3.7n. We FINALLY
got an error printed out on the console, which I'll post in a separate e-mail.
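
The symptom described above (the port opens, but no daemon ever answers)
can be told apart from a dead machine or a healthy one by probing from
another host. What follows is only a minimal sketch in Python, not
something from the original report: the function name `probe`, its return
labels, and the timeout are hypothetical. It assumes a healthy service
sends at least one byte unprompted after the connection is accepted, as
FTP does with its greeting and telnet does with option negotiation.

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a TCP service by how it reacts to a plain connect.

    Returns one of (labels are illustrative, not a standard):
      "no_connect"          - connect failed or timed out
      "accepts_but_silent"  - kernel accepted, but nothing was ever sent
      "closed_without_data" - peer closed before sending anything
      "responding"          - the service sent at least one byte
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "no_connect"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(1)  # wait for the first byte of a greeting
        except socket.timeout:
            # Handshake completed in the kernel, but no user-space
            # daemon produced output: the hang described above.
            return "accepts_but_silent"
        except OSError:
            return "no_connect"
        return "responding" if data else "closed_without_data"
```

Calling `probe(host, 21)` and `probe(host, 23)` against a machine in this
state would be expected to give "accepts_but_silent", while a healthy
machine should give "responding".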

-E.J. Wilburn
System Administrator - Woodtech Information Systems, Inc.
ej@woodtech.com