Re: 100Mbps TCP stalls in 2.1.115

Zlatko Calusic (Zlatko.Calusic@CARNet.hr)
09 Aug 1998 02:36:49 +0200


Andi Kleen <ak@muc.de> writes:

> Zlatko Calusic <Zlatko.Calusic@CARNet.hr> writes:
> >
> > E. g. FTP between two 100mbit hosts, transfer rate is ~50kb/sec, but
> > many times it recovers after few (or few tens of) seconds.
> >
> > Sometimes it works, sometimes not. Tcpdump logs available on request.
> >
> > Any clue?
>
> What does /proc/net/netstat say after such a stall?

{atlas} [~]% cat /proc/net/netstat
TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed EmbryonicRsts
TcpExt: 0 0 0 0

> Also could you try to reproduce it with sockets that have the SO_DEBUG
> flag set ?
>

So I should take my favorite TCP application and recompile it with one
additional setsockopt() call in the code, am I right?
I have never used SO_DEBUG before. :)
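
For reference, a minimal sketch of what that one extra call looks like.
It is written in Python only to keep it short; in C the equivalent is
setsockopt(fd, SOL_SOCKET, SO_DEBUG, &one, sizeof(one)) with an int one = 1.
The helper name make_debug_socket is made up for this example, and on some
systems setting SO_DEBUG may need elevated privileges, so the sketch
reports failure instead of aborting.

```python
import socket


def make_debug_socket():
    """Create a TCP socket and try to turn on SO_DEBUG for it.

    Returns (sock, enabled). enabled is False when the kernel refuses
    the option, for example because the process lacks the privileges.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The single additional call: same level/option/value as in C.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_DEBUG, 1)
        enabled = True
    except OSError:
        enabled = False
    return sock, enabled


if __name__ == "__main__":
    s, ok = make_debug_socket()
    print("SO_DEBUG enabled" if ok else "SO_DEBUG refused by the kernel")
    s.close()
```

The option has to be set before the connection whose behaviour you want
to trace is made, so in a real application the call would go right after
socket() and before connect() or listen().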

I noticed that the stalls are periodic, with a period of about 2-3
minutes.

I ran a long ping session and got this:

PING div.srce.hr (161.53.3.13): 56 data bytes
64 bytes from 161.53.3.13: icmp_seq=0 ttl=253 time=1.3 ms
64 bytes from 161.53.3.13: icmp_seq=1 ttl=253 time=1.1 ms
64 bytes from 161.53.3.13: icmp_seq=2 ttl=253 time=1.1 ms
64 bytes from 161.53.3.13: icmp_seq=3 ttl=253 time=1.0 ms
64 bytes from 161.53.3.13: icmp_seq=4 ttl=253 time=1.1 ms
... everything's ok, when ping times are around 1ms
64 bytes from 161.53.3.13: icmp_seq=147 ttl=253 time=1.0 ms
64 bytes from 161.53.3.13: icmp_seq=148 ttl=253 time=1.0 ms
... from now on, things are dog slow...
64 bytes from 161.53.3.13: icmp_seq=150 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=151 ttl=253 time=2.0 ms
64 bytes from 161.53.3.13: icmp_seq=152 ttl=253 time=2.0 ms
64 bytes from 161.53.3.13: icmp_seq=153 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=154 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=155 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=156 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=157 ttl=253 time=74.9 ms
64 bytes from 161.53.3.13: icmp_seq=158 ttl=253 time=28.1 ms
64 bytes from 161.53.3.13: icmp_seq=161 ttl=253 time=2.2 ms
64 bytes from 161.53.3.13: icmp_seq=163 ttl=253 time=2.1 ms
...
64 bytes from 161.53.3.13: icmp_seq=335 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=336 ttl=253 time=2.0 ms
... from now on, things are working correctly again...
64 bytes from 161.53.3.13: icmp_seq=337 ttl=253 time=1.1 ms
64 bytes from 161.53.3.13: icmp_seq=338 ttl=253 time=1.1 ms
... and so on...

Note that I cut the long listing down to the relevant pieces (watch those
icmp_seq numbers)!
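
To make the pattern in the listing easier to see, here is a rough sketch
that parses ping output and reports two things: which icmp_seq numbers
are missing, and the first and last sequence number of the slow phase.
The 1.5 ms threshold, the max_gap cutoff (used to skip the places where
the listing was cut by hand) and all function names are assumptions
chosen for this example. SAMPLE contains only lines copied from the
listing above.

```python
import re

# Matches the tail of a ping reply line: icmp_seq, ttl and round-trip time.
LINE_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def parse_ping(text):
    """Return a list of (icmp_seq, rtt_ms) for every reply line in text."""
    samples = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            samples.append((int(m.group(1)), float(m.group(2))))
    return samples


def missing_between(samples, max_gap=5):
    """List sequence numbers absent between consecutive replies.

    Jumps larger than max_gap are assumed to be manual cuts in the
    listing, not lost packets, and are ignored.
    """
    missing = []
    for (a, _), (b, _) in zip(samples, samples[1:]):
        if 1 < b - a <= max_gap:
            missing.extend(range(a + 1, b))
    return missing


def slow_span(samples, threshold_ms=1.5):
    """Return (first_seq, last_seq) of replies at or above threshold_ms."""
    slow = [seq for seq, rtt in samples if rtt >= threshold_ms]
    return (slow[0], slow[-1]) if slow else None


SAMPLE = """\
64 bytes from 161.53.3.13: icmp_seq=147 ttl=253 time=1.0 ms
64 bytes from 161.53.3.13: icmp_seq=148 ttl=253 time=1.0 ms
64 bytes from 161.53.3.13: icmp_seq=150 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=151 ttl=253 time=2.0 ms
64 bytes from 161.53.3.13: icmp_seq=152 ttl=253 time=2.0 ms
64 bytes from 161.53.3.13: icmp_seq=153 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=154 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=155 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=156 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=157 ttl=253 time=74.9 ms
64 bytes from 161.53.3.13: icmp_seq=158 ttl=253 time=28.1 ms
64 bytes from 161.53.3.13: icmp_seq=161 ttl=253 time=2.2 ms
64 bytes from 161.53.3.13: icmp_seq=163 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=335 ttl=253 time=2.1 ms
64 bytes from 161.53.3.13: icmp_seq=336 ttl=253 time=2.0 ms
64 bytes from 161.53.3.13: icmp_seq=337 ttl=253 time=1.1 ms
64 bytes from 161.53.3.13: icmp_seq=338 ttl=253 time=1.1 ms
"""

if __name__ == "__main__":
    replies = parse_ping(SAMPLE)
    print(missing_between(replies))  # [149, 159, 160, 162]
    print(slow_span(replies))        # (150, 336)
```

On these lines the slow phase runs from icmp_seq 150 to 336. If ping was
sending one request per second (an assumption, the interval is not shown
in the listing), that is roughly three minutes.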

Could this mean that some counter is under/overflowing or something
similar?

Unfortunately, I can't boot an older kernel right now, because I'm not
sitting at the machine and don't want to push my luck... :)

Thanks for your attention!

-- 
Posted by Zlatko Calusic           E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
	     Do not put statements in the negative form.
