Re: File-descriptors - large quantities

Michael O'Reilly (michael@metal.iinet.net.au)
Wed, 08 Jul 1998 13:38:36 +0800


In message <35A2FE83.32D7C7B1@brisnet.org.au>, Dancer writes:
> > You'd be _way_ better off going to squid 1.2.
>
> I'm on the squid-dev list. I test and run 1.2-beta caches on a
> test-bench server, mimicking the sorts of requests that we get on our
> production machines. Too unstable for our needs. We would lose face and
> testicles. Big time.

We run two 1.2beta22 squids in production use, restarted once per 24
hours, with two little patches in. They run fine.

> We're hanging directly off the Brisbane Telstra POP (several metres from
> the router). Yes, I would say that - at times - a piece of wet string
> would be preferable. We hit our problems during upstream network chokes
> that officially "don't happen".

What size link is this?

> > I strongly suspect your squid is bogged down on disk requests, causing
> > the number of outstanding requests to grow to silly amounts. Moving to
> > squid 1.2beta22 will dramatically improve your disk throughput and
> > correspondingly reduce the average request time-to-complete.
>
> No, we don't have disk-choking problems. The disk throughput is still
> way short of maxed out.

If you're even coming close to using 3000 FDs then I can guarantee
that your disk throughput is maxed out for squid 1.1.

On disks with 8 ms seek times, your absolute maximum I/O rate is 125
per second, no matter how big your disk array is. (This is because
squid 1.1 is a single process, so it can only have a single disk
request outstanding at any one time.)

Unless you're running a very small cache, or your disk cache is around
1% of your total disk array size, you'll be getting real rates of less
than half that (you incur a seek for the metadata as well as a seek
for the data; in practice it'll be worse, because you need to handle
the directory lookup as well). Say the best case, on very fast disks,
is around 60 tps. Note that I still haven't figured in the overhead of
the SCSI transaction or anything else here.

Now, cache hits will incur a disk read and cache misses will incur a
disk write (except for uncacheable data, at around 30% of requests).
So given a maximum disk transaction rate of 60 per second, and adding
that 30% plus a generous 20% for hot cache hits (neither of which
touches the disk), you can't handle more than about 90 requests/second
without becoming disk bound.

At 90 requests/second, that translates to a 33 _second_ average
request length to keep 3000 file descriptors in use.

For comparison: real numbers from a proxy cache that isn't disk bound
put the average request time at under 5 seconds. And this is in
Australia, too. :)

So unless your external links are underprovisioned by a factor of
about 7, you're disk bound.
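
(If you want to fiddle with the assumptions, here's the same
back-of-envelope sum as a little C program. The inputs, 8 ms seeks,
two seeks per object, 30% uncacheable, 20% hot hits and a 5 second
'healthy' request time, are just the figures from this mail, not
measurements off any particular box.)

/* Back-of-envelope version of the sums above.  All the inputs are the
 * assumptions from this mail, not measurements. */
#include <stdio.h>

int main(void)
{
    double seek_ms     = 8.0;
    double max_iops    = 1000.0 / seek_ms;   /* 125: one request at a time   */
    double tps         = 60.0;               /* < max_iops / 2, since every  */
                                             /* object costs a metadata seek */
                                             /* plus a data seek             */

    /* Uncacheable requests (~30%) and hot cache hits (~20%) never touch
     * the disk, so they ride on top of the 60 disk-bound transactions/s. */
    double req_per_sec = tps * (1.0 + 0.30 + 0.20);   /* ~90  */

    double fds         = 3000.0;
    double avg_secs    = fds / req_per_sec;           /* ~33 s */

    printf("theoretical seek-limited IOPS : %.0f\n", max_iops);
    printf("realistic disk tps            : %.0f\n", tps);
    printf("max requests/second           : %.0f\n", req_per_sec);
    printf("avg request time for %.0f FDs: %.1f seconds\n", fds, avg_secs);
    printf("vs. a healthy ~5 seconds      : about %.0fx too long\n",
           avg_secs / 5.0);
    return 0;
}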

You can watch this to a certain extent by doing
ps uaxr
and seeing how often squid comes up in run state 'D'. My guess is it'll
be in that state nearly 100% of the time.
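
(If you'd rather not sit there eyeballing ps output, a rough sampler
like the one below does the same job: it reads the state field out of
/proc/<pid>/stat once a second and keeps a running count of how often
it's 'D'. You have to feed it squid's PID yourself; treat it as a
sketch, not a tool.)

/* Rough D-state sampler.  Pass it squid's PID; it reads the state
 * field out of /proc/<pid>/stat once a second and counts how often
 * the process was in 'D' (uninterruptible disk wait). */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    char path[64], comm[64], state;
    int pid, samples = 0, dwait = 0;
    FILE *fp;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <squid-pid>\n", argv[0]);
        return 1;
    }
    snprintf(path, sizeof path, "/proc/%s/stat", argv[1]);

    for (;;) {
        fp = fopen(path, "r");
        if (fp == NULL)
            break;                      /* process has gone away */
        if (fscanf(fp, "%d %63s %c", &pid, comm, &state) == 3) {
            samples++;
            if (state == 'D')
                dwait++;
            printf("state %c   ('D' in %d of %d samples)\n",
                   state, dwait, samples);
        }
        fclose(fp);
        sleep(1);
    }
    return 0;
}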

Moving to squid 1.2 changes the picture dramatically. Squid 1.2 can
have up to 16 disk requests outstanding by default, so the kernel gets
to issue requests in parallel to different disks and do request
sorting, pushing the maximum to something like 400 tps for a 6-disk
array. Adding 'hot' directory handling pushes the maximum
requests/second into the 1000-and-up range, meaning that you start
getting CPU bound instead of disk bound.
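
(For the curious, there's no magic in how the multiple outstanding
requests are achieved; the usual trick is to hand the blocking
open()/read() calls to a small pool of worker threads, so several disk
operations are in flight at once and the kernel can schedule them
across the spindles. The sketch below just shows that pattern; it is
not squid 1.2's actual code, and the cache paths in it are made up.)

/* Not squid's code: a minimal sketch of one way to keep several
 * blocking disk requests in flight at once, i.e. a small pool of
 * worker threads doing the open()/read() calls while the main
 * (network) loop carries on. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

#define NWORKERS 16                 /* cf. "16 requests outstanding" */
#define QSIZE    64

struct ioreq { char path[256]; };

static struct ioreq queue[QSIZE];
static int qhead, qtail, qcount, done;
static pthread_mutex_t lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

static void *worker(void *arg)
{
    char buf[4096];
    struct ioreq req;
    int fd;

    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (qcount == 0 && !done)
            pthread_cond_wait(&nonempty, &lock);
        if (qcount == 0 && done) {
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        req = queue[qhead];
        qhead = (qhead + 1) % QSIZE;
        qcount--;
        pthread_mutex_unlock(&lock);

        /* The blocking calls happen here, in parallel across the
         * workers, so the kernel sees many requests at once and can
         * sort them and keep every spindle busy. */
        fd = open(req.path, O_RDONLY);
        if (fd >= 0) {
            while (read(fd, buf, sizeof buf) > 0)
                ;
            close(fd);
        }
    }
}

int main(void)
{
    /* Hypothetical cache object paths, purely for illustration. */
    static const char *paths[] = { "/cache/00/00/00000001",
                                   "/cache/00/00/00000002",
                                   "/cache/00/00/00000003" };
    pthread_t tids[NWORKERS];
    int i;

    for (i = 0; i < NWORKERS; i++)
        pthread_create(&tids[i], NULL, worker, NULL);

    for (i = 0; i < 3; i++) {       /* the main loop queues requests */
        pthread_mutex_lock(&lock);
        snprintf(queue[qtail].path, sizeof queue[qtail].path,
                 "%s", paths[i]);
        qtail = (qtail + 1) % QSIZE;
        qcount++;
        pthread_cond_signal(&nonempty);
        pthread_mutex_unlock(&lock);
    }

    pthread_mutex_lock(&lock);      /* tell the workers to drain and exit */
    done = 1;
    pthread_cond_broadcast(&nonempty);
    pthread_mutex_unlock(&lock);

    for (i = 0; i < NWORKERS; i++)
        pthread_join(tids[i], NULL);
    return 0;
}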

Michael.
