Re: W/RTT verification, linux tcp buffers behaviour
From: Stephen Hemminger
Date: Fri May 12 2006 - 11:51:14 EST
On Fri, 12 May 2006 10:18:13 +0200
"Constantinos Makassikis" <cmakassikis@xxxxxxxxx> wrote:
> Here's my problem:
>
> I am trying to verify the formula :
>
> W/RTT = Max Throughput
>
> between two end-hosts belonging to the same private network.
>
> Where:
>
> RTT stands for Round Trip Time
> W = min(CWND, AW, SNDBUF)
> CWND : the size of the congestion window
> AW : the size of the receiver's advertized window
> SNDBUF : the size of the send buffer
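To make the formula concrete, here is a minimal sketch of the bound as defined above. The function name and the sample values are only illustrative: the 8 KByte send buffer, the 12.7 ms RTT and the 6293248-byte advertized window are taken from the measurements further down, and CWND is assumed not to be the limiting term.

```python
def max_throughput_mbps(cwnd_bytes, aw_bytes, sndbuf_bytes, rtt_s):
    """Upper bound W/RTT in Mbit/s, with W = min(CWND, AW, SNDBUF)."""
    window = min(cwnd_bytes, aw_bytes, sndbuf_bytes)
    return window * 8 / rtt_s / 1e6

KB = 1024
# Assumption: CWND is unbounded (no losses), so only AW and SNDBUF matter.
bound = max_throughput_mbps(float("inf"), 6293248, 8 * KB, 12.7e-3)
print(f"{bound:.2f} Mbit/s")
```

With these inputs the window is the 8 KByte send buffer, which gives the 5,16 Mbits/sec figure in the first row of the table.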
>
> To do this, I made various bandwidth measurements between the
> two hosts. Specifically, I fixed the receiver's receive buffer
> at 16 MBytes and varied the sender's send buffer between 8 KBytes
> and 16 MBytes.
>
> As can be seen below, both hosts are capable machines connected
> to the private network through Gigabit Ethernet.
>
> Hosts' configuration:
> --------------------------
>
> Debian Linux 2.6.12-1-amd64-k8-smp
> AMD Opteron 246/248
> 2GB RAM
> 80 GB HDD
> Gigabit Ethernet
> TCP-specific options that are set via sysctl can be found at the end
> of this letter.
>
>
> As for the network itself, it appears to be of excellent quality:
> during all the experiments no retransmission is reported and the RTT
> ranges between 12 and 13 milliseconds.
>
> Normally, one shouldn't expect to approach very closely W/RTT but given
> the quality of both network (no losses and very stable RTT) and end hosts
> it is surprising to get at best only 70 % of W/RTT (see below for results).
>
> Bandwidth is measured with Iperf tool
> Tcp buffer sizes are set with Iperf tool (via setsockopt() )
> Traffic is dumped with tcpdump on both end hosts
> Traffic statistics from tcpdump traces are provided by tcptrace tool
>
> The tcpdump traces made for each transfer confirm the network quality.
>
> Here are some figures:
>
>  RTT  SNDBUF  RCVBUF  MAX SND   MAX AW  Iperf    W/RTT      %
> -------------------------------------------------------------
> 12,7       8   16384        8  6293248   3,57     5,16  69,18
> 12,7      16   16384    10,76  6293248    6,9    10,32  66,85
> 12,7      32   16384     21,4  6293248   13,7    20,64  66,37
> 12,7      64   16384     31,5  6293248   26,7    41,28  64,67
> 12,7     128   16384     49,5  6293248   54,4    82,56  65,88
> 12,7     256   16384      213  6293248    105   165,13  63,58
> 12,7     512   16384      266  6293248    171   330,26  51,77
> 12,7    1024   16384        -  6293248    382   660,52  57,83
> 12,7    2048   16384        -  6293248    673  1321,04  50,94
> 12,7    4096   16384        -  6293248    905  2642,08  34,25
>
> RTT     : round trip time (milliseconds)
> SNDBUF  : size of the TCP send buffer (KBytes)
> RCVBUF  : size of the TCP receive buffer (KBytes)
> MAX SND : average amount of data sent per RTT (KBytes),
>           estimated from the tcpdump traces
> MAX AW  : maximum size of the advertized window (bytes),
>           taken from the tcpdump traces
> Iperf   : throughput reported by Iperf (Mbits/sec)
> W/RTT   : maximum reachable throughput (Mbits/sec)
> %       : Iperf throughput as a percentage of W/RTT
>
> Although only the maximum size of the advertized window is reported,
> the advertized window actually grows beyond SNDBUF within a few RTTs,
> so I assumed it was safe to take W = min(CWND, SNDBUF). Since no
> retransmissions are detected, the CWND also grows beyond SNDBUF, and so
> I took W = SNDBUF to compute W/RTT.
>
> As can be seen, we barely reach 70 % of the value predicted by the formula,
> apparently because MAX SND remains relatively low compared to SNDBUF.
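One thing worth checking before reading too much into the last rows: W/RTT cannot exceed the link rate. A rough sketch, assuming the nominal 1000 Mbits/sec of Gigabit Ethernet is the only other bottleneck and using the 12.7 ms RTT from the table (the function names are made up for illustration):

```python
LINK_MBPS = 1000.0  # assumption: nominal Gigabit Ethernet rate, no other bottleneck
RTT_S = 12.7e-3     # RTT from the table

def bdp_bytes(link_mbps, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_mbps * 1e6 * rtt_s / 8

def capped_bound_mbps(sndbuf_kbytes, rtt_s=RTT_S, link_mbps=LINK_MBPS):
    """W/RTT with W = SNDBUF, capped at the link rate."""
    w_over_rtt = sndbuf_kbytes * 1024 * 8 / rtt_s / 1e6
    return min(w_over_rtt, link_mbps)

print(round(bdp_bytes(LINK_MBPS, RTT_S)), "bytes to fill the link")
# (SNDBUF in KBytes, Iperf in Mbits/sec) from the last three rows of the table
for sndbuf, iperf in [(1024, 382), (2048, 673), (4096, 905)]:
    bound = capped_bound_mbps(sndbuf)
    print(sndbuf, f"{bound:.2f}", f"{100 * iperf / bound:.1f}%")
```

Under that assumption the bandwidth-delay product is about 1.5 MBytes, so for SNDBUF of 2048 and 4096 KBytes the bound is the link itself rather than W/RTT, and 905 Mbits/sec is roughly 90 % of line rate, not 34 %.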
>
> Some questions follow.
>
> Questions:
> --------------
>
> 1) Am I missing or misunderstanding something ?
Linux does autotuning of send and receive buffer size.
> 2) Do you have any other ideas which could explain the low percentage reached ?
Max is limited by tcp_rmem/tcp_wmem, read Documentation/networking/ip-sysctl.txt
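Each of these sysctls holds three values in bytes: min, default and max. A small sketch for inspecting them, assuming the usual procfs location on Linux (the helper names are hypothetical, not an existing tool):

```python
from pathlib import Path

def parse_triple(text):
    """Parse a 'min default max' sysctl value (bytes) into a dict."""
    lo, default, hi = (int(field) for field in text.split())
    return {"min": lo, "default": default, "max": hi}

def read_limits(name, root="/proc/sys/net/ipv4"):
    """Read tcp_rmem or tcp_wmem from procfs (Linux only)."""
    return parse_triple(Path(root, name).read_text())

# Values from the sysctl.conf at the end of this mail.
print(parse_triple("4096 87380 67108864"))
```

On a live Linux host, read_limits("tcp_rmem") and read_limits("tcp_wmem") show the limits actually in effect, which may differ from sysctl.conf if it has not been reloaded.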
> 3) Supposing the low percentage is really due to the fact that the
> sender's buffer isn't fully used, why isn't it used to its fullest ?
> Is there some way to overcome this ?
>
> Misc Questions:
> --------------------
>
> i.e.: questions I tried to answer myself by searching around the
> internet but for which I didn't find any satisfactory answer or any
> answer at all.
>
> 4) Why is the advertized window steadily growing until it reaches 6
> MBytes instead of being given directly a size of 6 Mbytes at the
> beginning of the connection ?
Slow start and autotuning.
> 5) Why does the advertized window remain stuck at 6 MBytes ?
tcp_wmem
> 6) Why does the kernel allocate twice the size of the buffer size
> requested by setsockopt ?
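On the doubling: socket(7) documents that the kernel doubles the value passed to setsockopt() for SO_SNDBUF and SO_RCVBUF to leave room for bookkeeping overhead, and that getsockopt() returns the doubled value. A minimal sketch to observe it, assuming a Linux host and a request below net.core.wmem_max (larger requests are capped); the helper name is just for illustration:

```python
import socket

def request_sndbuf(requested_bytes):
    """Set SO_SNDBUF on a fresh TCP socket and return what the kernel reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)

requested = 8 * 1024
reported = request_sndbuf(requested)
# On Linux the reported value is typically twice the request; other
# systems may simply echo the requested size.
print(requested, reported)
```

So not all of the reported buffer is available for payload, which may be part of why MAX SND stays below SNDBUF.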
>
> Thank you in advance,
>
> Constantinos
>
>
> ###################
> # /etc/sysctl.conf #
> ###################
>
# increase Linux autotuning TCP buffer limits to 64M
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864