Re: Odd network packet counts being reported by NetXen 10G driver
From: Bill Fink
Date: Tue Apr 29 2008 - 23:44:44 EST
On Tue, 29 Apr 2008, Mark Seger wrote:
> I was running some network tests taking samples every 10 seconds on an
> nfs server connected to high-speed storage and a 10G NIC. The data was
> read with collectl, which reads the data from /proc/net/dev. When I
> look at the average packet sizes I come up with over 6K, even though jumbo
> frames are not enabled. Naturally my first suspicion was collectl's math so I
> went back to the original numbers and here's what I collected - the
> 'Net' at the beginning of each line is something I preface each line
> with so I can tell which file it came from as I collect data from lots
> of places. Anyhow, here are a couple of samples (don't know how they'll
> show up after terminal wrapping):
>
> Inter-|   Receive                                                |  Transmit
>  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
> Net eth2:1033991246323 770657768 0 885 0 0 0 0 209226011736 398954938 0 0 0 0 0 0
> Net eth2:1034142397679 773022462 0 885 0 0 0 0 215709166026 399913993 0 0 0 0 0 0
> Net eth2:1034294532107 775402505 0 886 0 0 0 0 222234389176 400880022 0 0 0 0 0 0
>
> If I do the math for the first 2 samples it comes out as
> 215709166026-209226011736=6483154290 bytes and
> 399913993-398954938=959055 packets. The result of
> 6483154290/959055=6760 bytes/packet. And this is not just for a couple
> of samples. Here's an example of collectl's output:
>
> #        Num Name   InPck InErr OutPck OutErr Mult ICmp OCmp    IKB    OKB
> 13:00:00   3 eth2: 236469     0  95905      0    0    0    0  14760 633120
> 13:00:10   3 eth2: 238004     0  96602      0    0    0    0  14856 637228
> 13:00:20   3 eth2: 238796     0  92460      0    0    0    0  14906 639296
> 13:00:29   3 eth2: 230974     0  93103      0    0    0    0  14419 618655
> 13:00:40   3 eth2: 234701     0  93414      0    0    0    0  14651 628492
> 13:00:50   3 eth2: 236121     0  94527      0    0    0    0  14739 632191
> 13:00:59   3 eth2: 237001     0  92259      0    0    0    0  14795 634433
>
> Note that these numbers are reported per second (KB/sec for the byte
> counts), so they are the raw /proc/net/dev deltas divided by 10.
> If I look at an earlier part of the same test, when I'm doing
> writes (which the system sees as InPck/IKB), I see:
>
> #        Num Name   InPck InErr OutPck OutErr Mult ICmp OCmp    IKB    OKB
> 12:30:00   3 eth2: 268813     0 142581      0    0    0    0 383735   9100
> 12:30:10   3 eth2: 258240     0 136903      0    0    0    0 368605   8734
> 12:30:20   3 eth2: 283324     0 150269      0    0    0    0 404486   9587
> 12:30:30   3 eth2: 278165     0 147478      0    0    0    0 396937   9405
> 12:30:40   3 eth2: 280848     0 148938      0    0    0    0 400896   9506
>
> and if I do the math on IKB (383735*1024/268813)=1462 which makes a
> whole lot more sense. Any ideas as to what would cause the driver to
> count the packets incorrectly? I know the byte counts are correct
> because this is an nfs server and the disk i/o rates are consistent
> with the network rates.
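
Just guessing, but perhaps on the transmit side TSO (TCP segmentation
offload) is involved: the stack hands the NIC driver large aggregated
segments, and the driver may be counting each of those as one packet.
You could test this by disabling TSO on eth2:

    ethtool -K eth2 tso off
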
Of course this might have some performance ramifications.
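
To see whether turning TSO off changes anything, you could compare the
average bytes per packet over an interval before and after the change.
Here is a rough Python sketch of that arithmetic. The function names are
made up for illustration (they are not part of collectl or any driver
tool), and it assumes the usual /proc/net/dev layout of 8 receive fields
followed by 8 transmit fields, with no counter wrap between the two samples:

```python
def parse_counters(line):
    """Return (rx_bytes, rx_packets, tx_bytes, tx_packets) from one
    /proc/net/dev interface line."""
    # Everything before the first ':' is the interface name (plus any
    # prefix such as the "Net " tag used in the samples above).
    _, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("no ':' found; not an interface line")
    fields = [int(f) for f in rest.split()]
    # Assumed layout: 8 receive counters, then 8 transmit counters.
    if len(fields) < 16:
        raise ValueError("expected at least 16 counters")
    return fields[0], fields[1], fields[8], fields[9]


def avg_packet_size(before, after):
    """Return (rx_avg, tx_avg) in bytes per packet between two samples.

    Assumes the counters did not wrap between the samples.
    """
    rx_b0, rx_p0, tx_b0, tx_p0 = parse_counters(before)
    rx_b1, rx_p1, tx_b1, tx_p1 = parse_counters(after)

    def ratio(delta_bytes, delta_packets):
        # Avoid dividing by zero on an idle interface.
        return delta_bytes / delta_packets if delta_packets > 0 else 0.0

    return ratio(rx_b1 - rx_b0, rx_p1 - rx_p0), ratio(tx_b1 - tx_b0, tx_p1 - tx_p0)


# The first two samples quoted above, with the wrapped lines rejoined.
sample_a = ("Net eth2:1033991246323 770657768 0 885 0 0 0 0 "
            "209226011736 398954938 0 0 0 0 0 0")
sample_b = ("Net eth2:1034142397679 773022462 0 885 0 0 0 0 "
            "215709166026 399913993 0 0 0 0 0 0")

rx_avg, tx_avg = avg_packet_size(sample_a, sample_b)
print(f"rx: {rx_avg:.0f} bytes/packet, tx: {tx_avg:.0f} bytes/packet")
```

For those two samples the transmit side works out to about 6760
bytes/packet, matching Mark's math. If the TSO guess is right, the same
calculation on samples taken with TSO off should come out at or below
the MTU.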
-Bill