(off topic?) Interesting TCP network problems

Patrick Main (72254.3077@compuserve.com)
Sat, 02 Aug 1997 02:03:00 -0300


First: this is probably not really Linux but Linux network
gurus may be interested and may run into similar problems.

Second: I am copying Alan to get his comments and to ask whether Linux
would handle this network.

The Problem (see below for details)
Our campus has had eight Class C addresses 199.78.xxxxx
Some of these are used as 256 nets <labs> and some are subnetted.

Now we are losing our addresses <change in backbone providers>
and have been given/loaned the equivalent size block delegated
from a Class B, ie: the "state" is delegating 150.176.(192-199).xxx

We have eight "equivalent Class C blocks" but of course subnetting
is now a little different.
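
To make the "equivalent size block" concrete, here is a minimal sketch
using Python's standard ipaddress module. It assumes the delegated range
150.176.(192-199).xxx can be treated as a single /21; how the state
actually delegates it is not something I know, so take the prefix as an
assumption.

```python
import ipaddress

# Assumption: 150.176.(192-199).xxx is handled as one /21 out of the Class B.
block = ipaddress.ip_network("150.176.192.0/21")

# Split it into Class C sized (/24) pieces.
class_c_sized = list(block.subnets(new_prefix=24))

print(len(class_c_sized))                   # 8
print(class_c_sized[0], class_c_sized[-1])  # 150.176.192.0/24 150.176.199.0/24
```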

The campus router is a multiport Bay Networks router (about $80,000)
currently using 14 ethernet ports and providing the bulk of the
TCP/IP routing tasks along with being an IPX BRIDGE <one IPX network>

OK FACTS: This router allows dual loading of IP addresses, similar to
IPalias I guess. Using this we were able to leave all the old
networks and addresses configured and simply loaded the new addresses
on top, ie: one port may have had 199.78.xxx.yyy/24 and now also has
150.176.xxx.yyy/24. This was tested first, then implemented on all ports,
and all seemed fine. The stations were then set up with the new addresses
and everybody seemed happy. We also have a few Linux servers spread
about and they kept chirping happily along.

Now I want to state that I feel quite comfortable discussing addresses,
network subnets, masks and so forth. I do have some questions concerning
the broadcast addresses, but that is coming up next along with the arp
failures.

I know this is getting wordy, but it is time to point out that we are
using the 150.176.xxx networks with different size subnets. This, I
understand, is called variable length subnetting, and at first it seemed
to work well.

The First problem: The old addresses were removed from the router with
the exception of the subnet that our internet gateway is on.

At first there were no reported problems, but during the next several
MINUTES it seemed that different networks went down.
Hindsight now shows that stations on the same subnet could
ping/see each other BUT not the router port which is each subnet's
gateway!!! ie: the router port and the workstations could not talk tcp.
The IPX bridging did continue to work, so the network cabling itself
seems fine.
Hindsight again suggests that when the router's arp table entries
expired, communication was lost with each station. In the library there
were several "mine is not working but his still is" reports until one
by one they all stopped working, ie: not a sudden loss of the lab but
a gradual one.
I have not determined whether it is the router or the workstations that
is unable to arp properly, but that will be tested on the workstation
side using a Linux workstation and experimenting with hand-set arp
entries. DOES anyone want to bet who is at fault? My two cents says not
the Linux workstation! :-) Actually I would only lose a beer!
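
The "one by one" pattern is what you would expect if each arp entry simply
aged out on its own schedule and could not be refreshed. A toy model of
that reasoning follows; the timeout value and the entry ages are made up
for illustration and are not taken from the router.

```python
# Toy model: when the old addresses were removed, every station's entry in
# the router's arp table already had some age. If entries can no longer be
# refreshed, each station drops off when its own entry reaches the timeout.
ARP_TIMEOUT = 300  # seconds; assumed value, NOT a known router default


def loss_times(entry_ages, timeout=ARP_TIMEOUT):
    """Return, sorted, how many seconds after the change each entry expires."""
    return sorted(max(0, timeout - age) for age in entry_ages)


ages = [10, 120, 250, 290]  # hypothetical ages of four library stations
print(loss_times(ages))     # [10, 50, 180, 290]
```

A staggered list like this matches a gradual loss of the lab rather than
a sudden one, but it says nothing about which side fails to arp.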

Alan, the next statements are why I am copying you on this, since it
now becomes very weird!

With a router port and all stations set up properly on a 150.176 subnet
the router stops talking to the workstations.
BUT when the only change is to set up a second address/network using
the old 199.78.xxx address on the router port that the affected subnet
is on, would you believe the stations suddenly start
talking to the router again!!!!!

ie: until a 199.78.xxx address and network is configured on the
Wellfleet router, the workstations are unable to talk to the router
using their 150.176 network.

This has been verified on the library network, which used to be
a 64 groupsize network with the old 199.78 addresses <mask
255.255.255.192> but has been increased to a 256 groupsize using a
150.176 network with, of course, a subnet mask of 255.255.255.0.

On the router the default broadcast address appears as 0.0.0.0 ?????
Changing this to the correct 150.176.196.254 (lib net is 150.176.196.0)
has no effect. A lot of the stations are windoze 95, which has an
easy winipcfg command to show the tcp/ip setup, and these are correct.
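
As a sanity check on the broadcast addresses, here is a small sketch
using Python's standard ipaddress module. The helper names are mine and
only for illustration. For a full 256 block the all-ones host address
comes out as .255, so the .254 entered above may be worth a second look.

```python
import ipaddress


def broadcast(net, mask):
    """Directed broadcast (all host bits set) for a network and netmask."""
    return str(ipaddress.ip_network(f"{net}/{mask}").broadcast_address)


def block_size(mask):
    """Number of addresses covered by a netmask (the 'groupsize')."""
    return ipaddress.ip_network(f"0.0.0.0/{mask}").num_addresses


print(broadcast("150.176.196.0", "255.255.255.0"))  # 150.176.196.255
print(block_size("255.255.255.0"))                  # 256
print(block_size("255.255.255.192"))                # 64
```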

Anyhow I have no idea anymore why adding an old 199.78.xxx.xxx address
with a netmask of 255.255.255.192 to the Library's Wellfleet port
would result in the network working. No stations use 199.78.xxx anymore,
AND the broadcast address of the 199... port is the default of 0.0.0.0.
Of course with a "what the heck" I tried 0.0.0.0 as the broadcast using
only a 150.176 address.
The result of all this: the library network and the Wellfleet router
will see each other if and only if the Wellfleet port has an old address
loaded on the port. For grins, removing the 150.176.. address from the
port also killed communications. I think my life would have ended
if the workstations had been able to talk to the old address only! ;-)

School is now out for two weeks so the testing begins now....
I suspect for some reason the campus router is being confused by the
variable subnets with the 150.176 networks. ie some ports have a
32 and 64 size block while other ports have a "full" 256 block.
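
One thing that can be ruled out on paper is a clash between the mixed
size blocks themselves. A minimal sketch, again with Python's ipaddress
module (Python 3.7 or later for subnet_of): apart from the library net,
the subnets listed are hypothetical examples and not our real port
assignments, and the /21 is my assumption for the delegated range.

```python
import ipaddress
from itertools import combinations

# Assumption: the delegated 150.176.(192-199).xxx range treated as one /21.
BLOCK = ipaddress.ip_network("150.176.192.0/21")


def check_layout(subnets):
    """Return (subnets outside BLOCK, pairs of subnets that overlap)."""
    nets = [ipaddress.ip_network(s) for s in subnets]
    outside = [str(n) for n in nets if not n.subnet_of(BLOCK)]
    clashes = [(str(a), str(b))
               for a, b in combinations(nets, 2) if a.overlaps(b)]
    return outside, clashes


# Library net is from this post; the /27 and /26 are hypothetical.
layout = ["150.176.196.0/24", "150.176.192.0/27", "150.176.192.64/26"]
print(check_layout(layout))  # ([], [])

# A deliberately broken layout: the 32 block sits inside the 64 block.
bad = ["150.176.192.0/26", "150.176.192.32/27"]
print(check_layout(bad))
```

A clean result here only says the plan is consistent; it does not show
how the router itself treats the different masks.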

Also we have not tried removing all old addresses and rebooting the
router. We are afraid that we may lock ourselves out and be unable to
communicate with the router.

QUESTION? Has anyone any ideas? :-)
Second question: any suggestions for replacing the tcp routing duties
with Linux boxes? Has anyone used the 4 port Cogent board? How many
networks could one Linux box be expected to route, assuming only 10meg
ethernet?

Actually we only need to route tcp between about 8 or 9 networks.
What about two four port Cogent boards in one machine vs four PCI
network boards in each of three machines, ie: the latter adds up as
"4" + 4/"3" + 4/"3" for a three machine 10 port router

Could I use old 486 ISA network servers with newer NE2000 clones
as a cheapy solution? 486 33 MHz, 16 megs if needed: can this route
four networks with NE2000 cards, or does this task require better
network cards such as Tulip based PCI network cards?

Again, sorry for the waste of bandwidth, but I have never seen this kind
of tcp networking problem.