TCP connection issues against Amazon S3

From: Erik Grinaker
Date: Tue Jan 06 2015 - 10:14:48 EST


(CCing Yuchung, as his name comes up in the relevant commits)

After upgrading from Ubuntu 12.04.5 to 14.04.1 we have begun seeing intermittent TCP connection hangs for HTTP image requests against Amazon S3. 3-5% of requests will suddenly stall in the middle of the transfer before timing out. We see this problem across a range of servers, in several data centres and networks, all located in Norway.

A packet dump [1] shows repeated ACK retransmits for some of the requests. Using Ubuntu mainline kernels, we found the problem to have been introduced between 3.11.10 and 3.12.0, possibly in commit 0f7cc9a3c2bd89b15720dbf358e9b9e62af27126. The problem is also present in 3.18.1. Disabling the net.ipv4.tcp_window_scaling sysctl seems to solve it, but has obvious drawbacks for transfer speeds. Other sysctls do not seem to affect it.
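To put a rough number on the transfer-speed drawback: without window scaling, the TCP receive window is limited to the 16-bit header field, i.e. 65535 bytes, so a single connection cannot move more than one window per round trip. The sketch below just works through that arithmetic. The RTT values are illustrative assumptions, not measurements from the affected servers, and the function name is made up for this example.

```python
# Upper bound on single-connection TCP throughput when window scaling is off.
# Without the window scale option the advertised window is capped at the
# 16-bit header field: 65535 bytes.
MAX_UNSCALED_WINDOW = 65535  # bytes


def max_throughput_mbit(rtt_seconds, window_bytes=MAX_UNSCALED_WINDOW):
    """Return the window-limited throughput ceiling in Mbit/s.

    At most one full window can be in flight per round trip, so the
    ceiling is window / RTT.
    """
    return window_bytes * 8 / rtt_seconds / 1e6


if __name__ == "__main__":
    # Hypothetical round-trip times, chosen only for illustration.
    for rtt_ms in (10, 40, 100):
        ceiling = max_throughput_mbit(rtt_ms / 1000)
        print(f"RTT {rtt_ms:>3} ms: at most {ceiling:.2f} Mbit/s")
```

The ceiling shrinks in proportion to the round-trip time, so the cost of leaving window scaling off is largest on long paths.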

I am not sure if this is fundamentally a kernel bug or a network issue, but we did not see this problem with older kernels.

[1] http://abstrakt.bengler.no/tcp-issues-s3.pcap.bz2