Re: TCP connection issues against Amazon S3

From: Erik Grinaker
Date: Tue Jan 06 2015 - 14:51:13 EST


On 06 Jan 2015, at 19:16, Rick Jones <rick.jones2@xxxxxx> wrote:
>
>>>>>>> A packet dump [1] shows repeated ACK retransmits for some of the
>> TCP does not retransmit ACK ... do you mean DUPACKs sent by the receiver?
>>
>> I am trying to understand the problem. Could you confirm that it's the
>> HTTP responses sent from Amazon S3 got stalled, or HTTP requests sent
>> from the receiver (your host)?
>>
>> btw I suspect some middleboxes are stripping SACKOK options from your
>> SYNs (or Amazon SYN-ACKs) assuming Amazon supports SACK.
>
> The TCP Timestamp option too it seems.
>
> Speaking of middleboxes... It is probably a fish that is red, but a while back I stepped in a middle box (a load balancer) which decided that if it saw "too many" retransmissions in a given TCP window that something was seriously wrong and it would toast the connection. I thought though that was an active reset on the part of the middlebox. (And the client was the active sender not the back-end server)

Itâs looking increasingly probable that itâs something like that, since the sender (S3) appears to disable SACKs on the failing clients, while it enables SACKs on other functioning clients.

> I'm assuming one incident starts at XX:41:24.748265 in the trace? That does look like it is slowly slogging its way through a bunch of lost traffic, which was I think part of the problem I was seeing with the middlebox I stepped in, but I don't think I see the reset where I would have expected it. Still, it looks like the sender has an increasing TCP RTO as it is going through the slog (as it likely must since there are no TCP timestamps?), to the point it gets larger than I'm guessing curl was willing to wait, so the FIN at XX:41:53.269534 after a ten second or so gap.

Yes, there is one incident starting at XX:41:23. All the RSTs are sent at the end though, at the 30s Curl timeout. Iâve put up a stripped down pcap of a single request here:

http://abstrakt.bengler.no/tcp-issues-s3-failure.pcap.bz2


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/