Re: Headerless packets hitting ethernet?

Paul Rusty Russell (Paul.Russell@rustcorp.com.au)
Sun, 26 Sep 1999 17:27:14 +0930


In message <199909241503.TAA27762@ms2.inr.ac.ru> you write:
> BTW you could easily discover this fact looking at source.

I'm not going to make myself popular here, but the network code is
*not* easy to read.

#ifdef RANT
Consider: ip_route_output() returns a route, but you wonder if you
have to free it somehow? You look through ip_route_input, then
ip_route_input_slow, and verify that it's using a reference count.

Looking for _release, _unref etc. gets you nowhere; eventually you
find some code that uses ip_rt_put. Cool, so the networking jargon
for release is xxx_put().

So skb_put releases skbs then? Not fucking likely. kfree_skb does
that: of course, it doesn't actually kfree the skb unless the refcnt
is 0. skb_push and skb_pull have nothing to do with putting skbs on a
stack. skb_realloc_headroom is not really realloc: it returns a copy
of the packet with more headroom, without freeing the old one. Of
course, the definition for `struct sk_buff' is not in `sk_buff.h', but
`skbuff.h'.

Consider tracing the code path of an IP packet from one NIC to another
involves 3 chained strategy functions (unless it's fast routed!):
packet_type's descriptive `func()', the routing code's output(), and
then the queuing disc's function if there is one.

Figuring out the packet_type's func is a simple matter of finding how
they're registered (dev_add_pack of course!) and looking for it in
ipv4/, to see that it's going to be ip_rcv().

Figuring out how the packet gets from ip_finish_output2 to
dev_queue_xmit is something I've NEVER managed to do; grepping for
dev_queue_xmit or hh_output gives nothing I can see, neither does the
desparate resort of grepping for `output'.

But if you manage to get through that, can you figure out that the
queuing discipline on the device has a queue: ethertap.c: doesn't set
it statically, or in init_module, nor register_netdev(), try
ethertap_probe(), no, but it calls net_init.c's ether_setup(), which
doesn't set it either. grep'ing the directory reveals that noone
mentions `qdisc'.

By guessing, I look in sched/sch_generic.c: my, this
dev_init_scheduler() looks like it might be something, and grepping
again finds it.

After all this, there's NO WAY you're going to bet that you didn't
miss something, or get it wrong: trying to figure what fields are
going to be valid in the skb at the end (and which of those is
coincidence which won't be true in the next kernel version) is not
realistic.

#endif /*RANT*/

There are *reasons* why none of the external add-ins to the networking
stack have never been reliable worth a damn (masquerading, CIPE,
Free/SWAN, redirection etc) and there have been holes in our packet
filtering (early 2.0 iirc). The network code simply isn't readable on
any scale larger than a single function, and it's not documented.

I have a huge respect for the hacking skills of the IP maintainers,
but don't be under any illusions that outsiders can make mods to the
500,000 lines of networking code. They can't.

(Yeah, I know, the netfilter hooks added to the unreadability of the
code...)
Rusty.

--
Hacking time.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/