Re: networking / web perf probs

Marc Slemko (marcs@znep.com)
Sun, 14 Dec 1997 00:04:28 -0700 (MST)


On Sat, 13 Dec 1997, Larry McVoy wrote:

> I was at a web conference last week and there was a paper presented that
> attempted to claim that there is no performance problem with web servers.
> Silly, I know. Fortunately, Jeff Mogul had a look at it and did a 10
> minute rebuttal that showed what the problem was, why it would appear
> that the servers were not loaded, and what the fix was. It's a BSD
> problem so maybe it doesn't exist in Linux - I just want to make sure.
>
> The problem is that when a web server has more than a certain number
> of packets in the input queue, new packets will just get dropped.
> It isn't so bad if one of the new packets is a data packet, but it is
> horrible if one of the dropped packets is the connection setup SYN.
> The retransmit timeout is an exponential backoff, starting at 5 seconds.
> This is why lots of people hit the "STOP" button on their browser and
> reload and that works better.

Sometimes. This really is only a problem on obsolete operating systems
(unfortunately, many are in common use) or with old servers.

It isn't actually an input queue for all packets, just a queue of
incomplete connections, i.e. ones where the SYN has arrived but the
handshake hasn't finished. The details vary slightly from OS to OS, and
the queue may even be gone entirely in some situations.

> A server in this situation is not necessarily out of CPU, in fact it is
> quite likely that the server is quite idle. The resource in question
> is the input packet queue, not CPU cycles. Typical BSD based systems
> suffering from this problem are usually 90+% idle.
>
> The simple fix is to crank up the input queue. SGI cranked theirs
> to 512 packets per queue (and there is a queue per CPU). DEC cranked

Erm... I'm not sure that this is the same thing. Are you sure about a
queue per CPU? That doesn't make sense.

> theirs as well (anyone have OSF/1 header files out there to figure out
> how high it is?).
>
> Another part of the fix is to have
>
> listen(sock, 0)
>
> work like normal, but
>
> listen(sock, >0)
>
> should be changed (in the kernel) to be something like
>
> listen(sock, sizeof(input queue length))

I would think that is a bad idea. Some programs deliberately set a low
listen backlog and have valid reasons for it.

>
> There are a lot of leftover programs that think a back log of 5 is
> reasonable. Those programs are naive.

Sure, but they aren't up to handling modern web traffic anyway so what
does it matter? Current versions of Apache, for example, default to a
backlog of 511 (no, not 512; yes, there is an obvious reason) which can be
tuned using the ListenBacklog directive.

It is easy to fix any other programs that you have source for.

The biggest problem related to this is that many old OSes impose an overly
restrictive upper limit on the backlog, so even if a program gives a large
number to a listen() call, the OS may silently force it down to something
like 5. This does cause problems.
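
As a rough sketch of the point above, not taken from any particular kernel
or server: the first function models the clamp (the backlog you get is the
smaller of what you asked for and the OS ceiling), and the second shows a
program asking for a large backlog. The names effective_backlog and
open_listener, and the choice of the loopback address, are made up for
illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical model of the kernel's clamp: the backlog actually used is
 * the smaller of what the program requested and the OS ceiling. */
int effective_backlog(int requested, int os_max)
{
    if (requested < 0)
        requested = 0;
    return requested < os_max ? requested : os_max;
}

/* Open a TCP listener on 127.0.0.1 and ask for the given backlog.
 * A port of 0 lets the kernel pick a free port.
 * Returns the listening descriptor, or -1 on failure. */
int open_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* The kernel is free to reduce the backlog without reporting it;
     * listen() still returns 0 in that case. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With an OS ceiling of 5, effective_backlog(511, 5) gives 5, which is the
situation described above: open_listener(0, 511) would succeed, but the
queue would be no deeper than if the program had asked for 5.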

All of this has also changed with the code that has gone in to deal with
SYN floods...