Re: [PREEMPT-RT] [patch 4 14/22] timer: Switch to a non cascading wheel

From: Eric Dumazet
Date: Tue Aug 16 2016 - 10:36:18 EST


On Tue, Aug 16, 2016 at 5:46 AM, Richard Cochran
<richardcochran@xxxxxxxxx> wrote:
> Jouni,
>
> If I understand the test correctly, then the slightly different kernel
> timer behavior is ok, but the test isn't quite right. Let me explain
> what I mean.
>
> First off, reading test_ap_wps.py, the point of the test is to see if
> ten simultaneous connections are possible. I guess the server
> implements a hard coded limit on the number of clients. (BTW where is
> the server loop?)
>
> You said that the server also sets 'backlog' to ten. The backlog
> controls the size of the queue holding incoming connections that are
> in the SYN_RCVD or ESTABLISHED state but have not yet been
> accept(2)-ed by the server. This is *not* the same as the number of
> possible simultaneous connections.
>
> On Sat, Aug 13, 2016 at 12:12:26PM +0300, Jouni Malinen wrote:
>> Yes, it looks like a TCP connect() timeout. I use a significantly
>> reduced timeout in the test scripts since they are run unattended and
>> are supposed to terminate in a reasonable amount of time.. That said,
>
> I did not find where the client sets the one second timeout. Where
> does this happen?
>
>> If I increase that 20 to 50, I get more of such about 1.03 second
>> results at i=17, i=34, i=48..
>
> Can you provide the timings when the test runs on the older kernel?
>
>> Looking more at what exactly is happening at the TCP layer, this is
>> likely related to the server behavior since listen() backlog is set to
>> 10 and if there are 10 parallel connections, the last one is
>> immediately closed before reading anything.
>
> To clarify, when the backlog is exceeded, the new connection is not
> closed. Instead, the SYN is simply ignored, and the client is expected
> to re-transmit the SYN in the normal TCP fashion.
>
>> Looking at a sniffer capture (*), the three-way TCP connection goes
>> through fine for the first 15 connect() calls, but the 15th one does
>> not get a response to SYN. This SYN is the frame 47 in the capture
>> file with srcport == 60802. There is no SYN,ACK for it. The about one
>> second unexpected time for connect() comes from this, i.e., the
>> connection is completed only after the client side does TCP
>> retransmission of the SYN (frame #77) a second later and the server
>> side replies with RST,ACK (frame #78).
>
> This is the expected behavior.
>
>> So it looks like the issue is in one of the SYN,ACK frames getting
>> completely lost..
>
> No, the frame is not missing. It was never sent because the backlog
> was exceeded.
>
> Here is what I suspect is happening. By sending 20 SYN frames to a
> port with a backlog of 10, it saturates the queue. One SYN is ignored
> by the kernel, and a race begins between the connect() timeout and the
> SYN re-transmission. If the client's re-transmitted SYN and then the
> server's SYN,ACK returns before the connect timeout, then the call to
> connect() succeeds. With the new timer wheel, the result of the race
> is different.
>
> There are a couple of ways to deal with this. One is to increase the
> backlog on the server side. Another is to increase the connect()
> timeout to a multiple of the re-transmission interval.
>
> Thoughts?
>

I am coming late to the party, but yes, the test looks flaky.

(Relying on having very precise SYN retransmits when listen backlog on
server side is full)
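
To make the race concrete, here is a rough sketch of when SYNs leave the
client relative to a connect() timeout. It assumes an initial
retransmission timeout of 1 second that doubles on each retry, and 6
retries; those values and the function names are illustrative
assumptions, not taken from the test scripts or the capture.

```python
def syn_send_times(initial_rto=1.0, retries=6):
    """Return offsets (seconds after connect() starts) at which SYNs go out.

    Assumes plain exponential backoff: the first retransmit follows the
    original SYN by initial_rto, and each later gap is twice the previous.
    """
    times = [0.0]          # the original SYN
    rto = initial_rto
    for _ in range(retries):
        times.append(times[-1] + rto)
        rto *= 2
    return times


def retransmits_before(timeout, initial_rto=1.0, retries=6):
    """Count SYN retransmits sent strictly before the connect() timeout."""
    sends = syn_send_times(initial_rto, retries)[1:]  # skip the original SYN
    return sum(1 for t in sends if t < timeout)


if __name__ == "__main__":
    for timeout in (1.0, 1.05, 3.5):
        print(timeout, retransmits_before(timeout))
```

With a timeout at or just above the first retransmit, whether connect()
succeeds depends on how precisely that retransmit timer fires, so a
timeout spanning several retransmit intervals leaves more margin.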