Bonding driver has bad load balancing for forwarded traffic, 3.7+

From: Vitaly V. Bursov
Date: Mon Apr 15 2013 - 10:06:36 EST


Hello,

I have a bonding device (mode=802.3ad xmit_hash_policy=layer2+3 miimon=300), and
with kernels <3.7 forwarded IPv4 traffic was distributed fine across multiple physical
links. The Ethernet cards are Intel 82576 with the igb driver (various versions).

The 3.7 and 3.8 kernels tend to fully utilize only one link and leave the others almost idle.

Replacing the bond_xmit_hash_policy_* functions with the older ones (from the 3.6 kernel)
appears to resolve the issue (but I haven't tested it thoroughly).

So, I added

    printk(KERN_INFO "hash_policy: protocol = %d, skb_network_header_len = %d, %d %d\n",
           skb->protocol, skb_network_header_len(skb),
           skb_headlen(skb), skb_network_offset(skb));

to bond_xmit_hash_policy_l23() in bond_main.c

and got this:
[ 65.280831] hash_policy: protocol = 8, skb_network_header_len = 0, 74 14
[ 65.280835] hash_policy: protocol = 8, skb_network_header_len = 0, 74 14
[ 65.280839] hash_policy: protocol = 8, skb_network_header_len = 0, 74 14
[ 65.280843] hash_policy: protocol = 8, skb_network_header_len = 0, 74 14
[ 65.280847] hash_policy: protocol = 8, skb_network_header_len = 0, 74 14
[ 65.280851] hash_policy: protocol = 8, skb_network_header_len = 0, 74 14

It's clear that the new check condition (skb_network_header_len(skb) >= sizeof(*iph))
fails here, so the hash policy falls back to layer2 balancing.

I have no idea how to fix this besides removing this check completely, any
help would be appreciated.

--
Thanks
Vitaly