Re: Kernel Routing Issues/Packet loss (argh.)

From: J. Scott Kasten (jsk@titan.tetracon-eng.net)
Date: Mon Feb 21 2000 - 09:49:16 EST


I got this post off linux-kernel. That's not really the right list
for this post. (Just thought I'd mention that in case the flames come...)

Anyway, in answer to your problem, I can think of a couple of factors that
may be coming into play.

Some cable modem/DSL devices apparently capture the MAC address of the
first Ethernet frame they see on the local network and then refuse to
respond to anything else until they are power cycled. A family member
has one that behaves that way, and I've seen posts on other lists about
it. As it turns out, many of them have a built-in DHCP server for
one address. DHCP of course is tied tightly to the MAC address of your
card, which explains why some units behave the way they do. Other
units don't have a DHCP server built in, but a few still seem to
be picky like that for reasons that I cannot fully explain. It may
have to do with how they filter traffic, or it may be a means to prevent
you from running multiple machines behind your connection.

Suggestion #1. Every time you switch machines, power cycle the cable
box to clear the registered MAC address. Every NIC has a unique
address embedded in its little EPROM, so each machine you plug in looks
like a different device to the modem.
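
To make the lock-on behaviour concrete, here is a toy model in Python. It
is only a sketch of what I described above, not how any real modem firmware
works; the ToyModem class and both MAC addresses are made up for
illustration.

```python
class ToyModem:
    """Toy model of a modem that locks onto the first MAC it sees.

    Purely illustrative; an assumption about the behaviour, not real firmware.
    """

    def __init__(self):
        self.learned_mac = None  # nothing registered right after power-on

    def frame_from_lan(self, src_mac):
        """Return True if a frame from src_mac would be passed upstream."""
        mac = src_mac.lower()
        if self.learned_mac is None:
            self.learned_mac = mac  # the first frame seen wins
        return mac == self.learned_mac

    def power_cycle(self):
        """Unplugging the box forgets the registered address."""
        self.learned_mac = None


modem = ToyModem()
windows_nic = "00:a0:c9:11:11:11"  # made-up addresses
linux_nic = "00:a0:c9:22:22:22"

print(modem.frame_from_lan(windows_nic))  # True: first machine registers
print(modem.frame_from_lan(linux_nic))    # False: different MAC is ignored
modem.power_cycle()
print(modem.frame_from_lan(linux_nic))    # True: registration was cleared
```

If your modem really behaves like this, whichever machine talks first
after power-on is the only one that gets through.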

The second thing that comes to mind is that there are various iterations
of DHCP in existence. There are two official DHCP specs, RFC 1541 and
RFC 2131; RFC 2131 obsoletes RFC 1541. A server or client can be
written to either spec. Although servers and clients of different
revisions SHOULD interoperate, my own experience indicates that there can
be problems: the 1541 spec left enough grey areas open to interpretation
that a 1541 implementation sometimes breaks when talking to a 2131 one.

Not only that, but implementations of DHCP clients and servers are sometimes
flaky in specific combinations, RFC differences aside.

Suggestion #2. Hang another box in the loop somewhere, run Ethereal
or some other network sniffer, and watch the packets as your Windows box
and your Linux box each try to obtain an address. See how the
conversations differ.
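
When comparing the captures, the fields most likely to differ are the
message types, the client hardware address, the broadcast flag, and the
list of options each client sends. Here is a rough Python sketch that
decodes one DHCP message from its raw UDP payload, following the fixed
BOOTP header layout and magic cookie from RFC 2131. The parse_dhcp function
and the sample packet are made up for illustration; getting the payload
bytes out of your sniffer is left to you.

```python
import struct

# Fixed BOOTP/DHCP header: op, htype, hlen, hops, xid, secs, flags,
# ciaddr, yiaddr, siaddr, giaddr, chaddr, sname, file (236 bytes total).
BOOTP_HEADER = struct.Struct("!BBBBIHH4s4s4s4s16s64s128s")
MAGIC_COOKIE = bytes([99, 130, 83, 99])
MESSAGE_TYPES = {1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
                 5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM"}


def parse_dhcp(payload):
    """Summarize one DHCP message (the UDP payload seen on ports 67/68)."""
    fields = BOOTP_HEADER.unpack_from(payload)
    hlen, xid, flags, chaddr = fields[2], fields[4], fields[6], fields[11]
    options = payload[BOOTP_HEADER.size:]
    if options[:4] != MAGIC_COOKIE:
        raise ValueError("no DHCP magic cookie (plain BOOTP?)")
    summary = {
        "xid": xid,
        "client_mac": ":".join("%02x" % b for b in chaddr[:hlen]),
        "broadcast_flag": bool(flags & 0x8000),
        "type": None,
        "options": [],  # option codes in the order the sender listed them
    }
    i = 4
    while i < len(options):
        code = options[i]
        if code == 0:        # pad option has no length byte
            i += 1
            continue
        if code == 255 or i + 1 >= len(options):  # end option or truncated
            break
        length = options[i + 1]
        data = options[i + 2:i + 2 + length]
        summary["options"].append(code)
        if code == 53 and length == 1:  # option 53 = DHCP message type
            summary["type"] = MESSAGE_TYPES.get(data[0], "unknown")
        i += 2 + length
    return summary


# Hand-built DHCPDISCOVER with a made-up MAC and transaction id.
sample = BOOTP_HEADER.pack(1, 1, 6, 0, 0x12345678, 0, 0x8000,
                           bytes(4), bytes(4), bytes(4), bytes(4),
                           bytes.fromhex("00a0c9112233") + bytes(10),
                           bytes(64), bytes(128))
sample += MAGIC_COOKIE + bytes([53, 1, 1, 55, 2, 1, 3, 255])
print(parse_dhcp(sample))
```

Run it over the first packet from each client and compare the two
summaries side by side; any field that differs is a place to start looking.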

Suggestion #3. I'm willing to bet that Slackware and RH have different
DHCP server/client software.

Suggestion #4. Try suggestions 1 through 3 in various combinations until
you find the problem or find something that works. This one sounds like
a lot of fun....

Suggestion #5. Let Windows get your DHCP address, etc., then switch to
Linux and in your network config, set the HWADDRESS of the ethernet device
to match the MAC of your Windows box. Then see if Linux runs OK that way.
(You will of course make things on your network very unhappy if the Windows
box is still running. This is just intended to be a test, not an operational
scenario.)
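
With net-tools the override is usually done with something like
"ifconfig eth0 hw ether <mac>" while the interface is down, though whether
your driver allows it is another matter. One thing that trips people up is
that Windows prints the address with dashes and Linux prints it with
colons, so it is easy to mistype. Below is a small Python sketch for
checking that the two really match; normalize_mac and same_mac are
hypothetical helpers, and the addresses are made up.

```python
import re


def normalize_mac(text):
    """Return a MAC as lowercase colon-separated hex, e.g. 00:a0:c9:12:34:56."""
    digits = re.sub(r"[-:.\s]", "", text)  # drop the usual separators
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError("not a 48-bit MAC address: %r" % text)
    digits = digits.lower()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def same_mac(a, b):
    """True if both strings name the same hardware address."""
    return normalize_mac(a) == normalize_mac(b)


windows_style = "00-A0-C9-12-34-56"  # as Windows tools print it
linux_style = "00:A0:C9:12:34:56"    # as ifconfig prints it
print(same_mac(windows_style, linux_style))
```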

I hope this helps get you closer to your answer...

On Sun, Feb 20, 2000 at 10:28:53PM -0700, Andrew J. Feldhacker wrote:
> Ok... I've been pounding my head on my desk for the last 12hrs. because of this:
>
> I've been a subscriber to the cox@home cable modem service for the last 9 months, problem-free.... I'm using a DEC Alpha running Debian 2.1 and Kernel 2.2.14 for IP forwarding/masq. for my LAN...
>
> I just moved to a new apartment, and I've gotten my cable service transferred over to the new residence... I was given a different cable modem (was using a Lancity model, now a surfboard sb3100), and a new set of IPs for my host, subnet, etc.
>
> So, I plugged in the new IPs to my /etc/init.d/network script in Debian, re-ran it, and tried to ping an external IP... 100% packet loss. Rebooted and tried pinging again, 100% packet loss....
>
> so I do...
> # ifconfig eth0 down
> # ifconfig eth0 xx.xx.xxx.xx netmask xxx.xxx.xxx.xxx broadcast xxx.xxx.xxx.xxx
> # route add default gw xx.xx.xxx.x netmask 0.0.0.0 metric 1 eth0
>
> Checked over my routing table, verified my settings, etc... they all seem fine, but still 100% packet loss... while the system is sending the packets out, the activity lights for the NIC blink, but still, nothing comes back.
>
> I figured it was time for a second opinion, so I hooked the cable modem up to an i386 based system running Slackware 7.0 (Kernel 2.2.13)... edited my /etc/rc.d/rc.inet1 to reflect the changes to my IPs and gateway, re-ran the script, and tried to ping... 100% packet loss... rebooted.... 100% packet loss.
>
> Tried it in windows, be it by hard-coding or DHCP, everything is cool... bear in mind that all the while, both linux systems work fine on my LAN.
>
> Rather frustrated, I re-install Debian on the Alpha... which, because it comes with it by default, is now running Kernel 2.0.36... same thing... this is even after inputting the network info into the Debian setup program... 100% packet loss.
>
> Even more frustrated, I start out a Slackware install on a spare partition that I have on my workstation... I tell it to use DHCP... upon booting up, dhcpcd times out waiting for the request to be answered. So I hard code it.... 100% packet loss.
>
> Alright, I'm last ditch now... I install RedHat 6.1 on the spare partition... it tries for DHCP... and times out. Because I'm a glutton for punishment, I hardcode....
>
> # ifconfig eth0 xx.xx.xxx.xx netmask xxx.xxx.xxx.xxx broadcast xxx.xxx.xxx.xxx
> # route add default gw xx.xx.xxx.x netmask 0.0.0.0 metric 1 eth0
>
> and I ping............ AND GET REPLIES, WITH 0% PACKET LOSS!??!?!
>
> The only thing that I can see different here is the kernel version between the 3 distros.... bear in mind, again, that on all systems/installs, my LAN works fine, it's just thru the cable modem that I have problems... I've flushed all ipchains rules, etc., so those aren't a problem, and the routing, etc., on the Slackware and Debian installs match up exactly with redhat... but yet, redhat works, and debian/slackware don't! (even more annoyingly, windows works too.)... the only things that have changed from my old residence are:
>
> IPs (IP address, subnet, gateway, netmask, broadcast)
> Cable Modem (went from a Lancity to a Surfboard sb3100)
>
> I suspect the cable modem, but NONE OF THIS MAKES ANY SENSE TO ME!!!
>
> I would really prefer running debian as opposed to redhat, so if ANYONE has ANY suggestions, they would be MUCH appreciated.
>
> Thanks VERY much.....
> Andrew Feldhacker
>
> Technology is a word that describes something which doesn't work yet.
> -Douglas Adams

-- 
J. Scott Kasten

jsk AT tetracon-eng DOT net

"That wasn't an attack. It was preemptive retaliation!"



