Re: Linux 2.2 - BSD/OS 4.1 ARP incompatibility

From: Ricky Beam (jfbeam@bluetopia.net)
Date: Mon Sep 04 2000 - 14:38:32 EST


(OK, I've read enough of this crap.)

On Sat, 2 Sep 2000, David Luyer wrote:
>I'm seeing a problem between Linux 2.2 and BSD/OS 4.1 in the situation on one
>of our backbones.

Why is it the people placed in charge of networks usually have no clue
how they work? (don't answer that.)

[broken network topology deleted]
>Which d.e.f.2 promptly ignores, presumably because the IP stack in BSD/OS
>throws it away at a low level, or possibly simply because BSD/OS has no
>idea where to send the ARP response.

Or both. Both _are_ true.

>Is this already fixed in 2.4 or it is something which needs investigation
>and a patch?

Neither. The problem is your network topology. What you have is something
Cisco should include in their training -- even cisco certified nuts often
cannot figure this out. Every network engineer will eventually lose a toe
on this one.

You have multiple subnets on the same physical network. However, the machines
within those subnets don't know about the other subnets. As long as there
aren't any machines in both subnets -- as you have -- then everything will
function more or less as long as the router ignores a few basic routing rules
("never retransmit a packet onto the interface onwhich it was received.")
The INSTANT you have a machine in (or even aware of) the other subnets,
you see the problem you currently have.

IF (that's a big if) the source is unbound, then the system is free to
choose any "valid" interface address. Logically, it should choose an
address within the dest network. However, I am unaware of any RFC stating
that as a MUST. Technically, ANY address within that physical network is
valid. However, the machines in the other subnets don't know about the
other subnets and thus cannot answer a "martian" arp. (Even if they wanted
to answer, they don't have a physical route.)

If the source is bound, then you're screwed. The system has no choice but
to send the arp with the bound address.

As a general rule, EVERY MACHINE SHOULD BE AWARE OF THE TRAFFIC ON IT'S
CABLE. What you (and everyone else I've ever known) have created is an
unstable network. It's not limited to Linux and BSD. I've seen this crap
between an assortment of OSes.

While a kernel patch might help, it will not prevent this. The kernel cannot
stop you from making bad networks.

As a case study, the ISP I used to work for had, at one point, 11 (yes,
eleven) logic networks on one physical network -- two private networks,
three office networks, the core network, a bridged customer network, the
entire web farm, etc. Some of those networks used to be isolated
physical networks -- and my workstation has an interface in all three. We
constantly had odd problems because no machine (aside from the router)
knew exactly what was where. The helpdesk machines couldn't access the
web farm until I added a route to all the machines (solaris web servers
ad win95 pc's.)

--Ricky

PS: I'm saddened to see the number of router vendors ignoring the specs
    because their customers are building bad networks (read: "idiots")

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Sep 07 2000 - 21:00:19 EST