[patch] ts_recent in tsecr of syn-ack [Re: Weird tcp performance differences with 2.0 and 2.2 kernel

Andrea Arcangeli (andrea@e-mind.com)
Sat, 13 Feb 1999 18:04:16 +0100 (CET)


On Sat, 13 Feb 1999, Mike Galbraith wrote:

>Do you have a pointer to the rfc? I need to do some serious reading and
>head scratching to understand what you just said.

andrea@laser:~$ grep ftp /usr/doc/doc-rfc/copyright
It was put together from <ftp://ftp.germany.eu.net/pub/documents/rfc>,
<ftp://venera.isi.edu/in-notes/rfc-editor>, and
andrea@laser:~$

>I had stated that this was a recent thing.. damn lie that. I started
>a binary search to see (curiosity) when this started, and ran into it
>immediately in stock 2.1.108. I guess it's been there for a long while,

You mean that everything is fine in 2.1.107? If so could you send me the
TCP diffs included in patch-2.1.108?

>918804847.288893 168.100.187.42.20 > 193.203.186.200.1159: S 3082754207:3082754207(0) win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp 3027206 0,nop,nop,ccnew 43642> (DF) (ttl 46, id 4950)
>918804847.290704 193.203.186.200.1159 > 168.100.187.42.20:
>S 2264398605:2264398605(0) ack 3082754208 win 32120 <mss 1460,nop,nop,
>timestamp 1957148 0,nop,wscale 0> (DF) (ttl 64, id 10040)
^ this is the bit I am talking about

porcupine doesn't put a 0 in such place but it would put the timestamp
3027206 instead. This looks the natural way to handle timestamps but after
my yesterday very (and only) short read of the RFC it seems not strictly
requested so I can't tell you that Linux is buggy putting _always_ a 0 in
the tsecr of every syn-ack.

>918804847.836983 168.100.187.42.20 > 193.203.186.200.1159: . ack 1 win 17376 <nop,nop,timestamp 3027207 1957148> (DF) (ttl 46, id 4952)
>918804848.671411 truncated-ip - 7 bytes missing!168.100.187.42.20 > 193.203.186.200.1159: . 1:1449(1448) ack 1 win 17376 <nop,nop,timestamp 3027207 1957148> (DF) (ttl 46, id 4954)

This is something of crazy and should be a bug in porcupine or in the
middle man.

>and porcupine is waiting for my ack. Does it look like receiver troubles
>to you?.. or is this just coincidence and tcpdump is being a PITA? It

Don't know. Trusting the SOCK_PACKET code looks like it's a sender bug to
me.

>Anyone see exactly this problem on a _non_ isdn connection?

Really yesterday I tried an ls in /pub/security and it was very slow (as
_everything_ else from here) but worked after a while.

Would it be possible that porky (or some other men in the middle) when
receive the syn-ack it just does (jiffies - tsecr) and use the computed
value to set a bogus RTO and/or to do some other mess? I don't think it's
the case but to give a try I changed the 2.2.2-pre2 linux TCP stack to do
exactly what porcupine does (reverse engeneered from the tcpdump, I don't
have the time to see which OS is running porcupine) in the synack
timestamp case.

You may want to give a try to this my patch and see if the bad behavior
will go away.

Index: include/net/tcp.h
===================================================================
RCS file: /var/cvs/linux/include/net/tcp.h,v
retrieving revision 1.1.2.2
diff -u -r1.1.2.2 tcp.h
--- tcp.h 1999/01/18 14:16:27 1.1.2.2
+++ linux/include/net/tcp.h 1999/02/13 16:06:37
@@ -912,7 +912,7 @@
* can generate.
*/
extern __inline__ void tcp_syn_build_options(__u32 *ptr, int mss, int ts, int sack,
- int offer_wscale, int wscale, __u32 tstamp)
+ int offer_wscale, int wscale, __u32 tstamp, __u32 ts_recent)
{
/* We always get an MSS option.
* The option bytes which will be seen in normal data
@@ -936,7 +936,7 @@
*ptr++ = __constant_htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) |
(TCPOPT_TIMESTAMP << 8) | TCPOLEN_TIMESTAMP);
*ptr++ = htonl(tstamp); /* TSVAL */
- *ptr++ = __constant_htonl(0); /* TSECR */
+ *ptr++ = htonl(ts_recent); /* TSECR */
} else if(sack)
*ptr++ = __constant_htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) |
(TCPOPT_SACK_PERM << 8) | TCPOLEN_SACK_PERM);
Index: net/ipv4/tcp_output.c
===================================================================
RCS file: /var/cvs/linux/net/ipv4/tcp_output.c,v
retrieving revision 1.1.2.4
diff -u -r1.1.2.4 tcp_output.c
--- tcp_output.c 1999/01/24 02:46:31 1.1.2.4
+++ linux/net/ipv4/tcp_output.c 1999/02/13 16:17:03
@@ -30,6 +30,7 @@
* David S. Miller : Charge memory using the right skb
* during syn/ack processing.
* David S. Miller : Output engine completely rewritten.
+ * Andrea Arcangeli: SYNACK carry ts_recent in tsecr.
*
*/

@@ -135,7 +136,8 @@
(sysctl_flags & SYSCTL_FLAG_SACK),
(sysctl_flags & SYSCTL_FLAG_WSCALE),
tp->rcv_wscale,
- TCP_SKB_CB(skb)->when);
+ TCP_SKB_CB(skb)->when,
+ tp->ts_recent);
} else {
tcp_build_and_update_options((__u32 *)(th + 1),
tp, TCP_SKB_CB(skb)->when);
@@ -862,7 +864,8 @@
TCP_SKB_CB(skb)->when = jiffies;
tcp_syn_build_options((__u32 *)(th + 1), req->mss, req->tstamp_ok,
req->sack_ok, req->wscale_ok, req->rcv_wscale,
- TCP_SKB_CB(skb)->when);
+ TCP_SKB_CB(skb)->when,
+ req->ts_recent);

skb->csum = 0;
th->doff = (tcp_header_size >> 2);

Let me know (I don't have the time to try myself).

Andrea Arcangeli

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/