Re: Thread implementations...

Alex Belits (abelits@phobos.illtel.denver.co.us)
Fri, 26 Jun 1998 00:50:30 -0700 (PDT)


On Fri, 26 Jun 1998, Chris Wedgwood wrote:

> > know is limited to sending entire files verbatim onto a socket.
>
> sendfile is most useful when you can append or prepend data constructed from
> an iovec.

But what would be the difference between using sendmsg(2)/writev(2)
alongside that syscall and a syscall with an option to prepend/append data?
It's not like writev will copy any less data when combined with such a
syscall.
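
To make the comparison concrete, here is a minimal sketch of the plain
writev(2) route: the header and the first chunk of the file go out in one
gathered write, and the rest follows in ordinary writes. The names
send_with_header and writev_all are hypothetical helpers for illustration
only, not part of any proposed interface, and the 8 KB buffer is an
arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Write everything described by iov[0..cnt), retrying on short writes. */
static int writev_all(int fd, struct iovec *iov, int cnt)
{
    while (cnt > 0) {
        ssize_t n = writev(fd, iov, cnt);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        /* Skip the vectors that were written completely... */
        while (cnt > 0 && (size_t)n >= iov->iov_len) {
            n -= iov->iov_len;
            iov++;
            cnt--;
        }
        /* ...and trim the one that was written partially. */
        if (cnt > 0) {
            iov->iov_base = (char *)iov->iov_base + n;
            iov->iov_len -= n;
        }
    }
    return 0;
}

/* Hypothetical helper: send the string hdr, then the contents of in_fd,
 * to out_fd. Returns the number of file bytes sent, or -1 on error. */
long send_with_header(int out_fd, const char *hdr, int in_fd)
{
    char buf[8192];
    long total = 0;
    int first = 1;

    assert(hdr != NULL);
    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0 && !first)
            break;

        struct iovec iov[2];
        int cnt = 0;
        if (first) {
            /* Header rides along with the first chunk in one writev(). */
            iov[cnt].iov_base = (void *)hdr;
            iov[cnt].iov_len = strlen(hdr);
            cnt++;
            first = 0;
        }
        if (n > 0) {
            iov[cnt].iov_base = buf;
            iov[cnt].iov_len = (size_t)n;
            cnt++;
        }
        if (writev_all(out_fd, iov, cnt) < 0)
            return -1;
        if (n == 0)
            break;
        total += n;
    }
    return total;
}
```

The file data is still copied through userspace here; the only thing the
gathered write saves is a separate syscall for the header.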

>
> > > fd-in - file descriptor to disk file (? is this needed as a restriction)
> > > fd-out - file descriptor to any object
> > > options - eg, close on send completion, sync/async. etc.
> >
> > All these options aren't pretty. I would hope you could do
> > without them. If sendfile blocks until it is complete, there
> > seems little point in having a close-on-send, and if you look
> > at Dean's website, you can see that the question of closing a
> > TCP/IP socket properly is non-trivial, and probably not for
> > kernel-space-only.
>
> In the case of web servers and proxy caches, I don't think close on send is
> worthwhile, because it defeats the purpose and the (possibly very great)
> win experienced with HTTP/1.1.
>
> I am perhaps overstating pipelining in HTTP/1.1, because I've measured this
> on active proxies and 40% of all requests come from single connections, but
> this value is likely to reduce as browsers get smarter.

Browsers often do HTTP/1.0 with Keep-Alive, which doesn't close the
connection either.

>
> (As is, by default, IE 4 and stuff don't do HTTP/1.1 through proxies because
> the piece of shit MS-Proxy 2 doesn't support this, which means the rest of
> the world gets penalized because M$ can't write decent code as per usual).

HTTP/1.1 actually is a pain to implement right in a lot of places.
However, close after send is what my server always does, even with
Keep-Alive connections: one process keeps the fd open and passes it to other
processes when necessary, and those processes close their fds after every
request, thus creating a pool of server processes that are ready even if
there are a lot of connections open and idle. Of course, the "main" process
can suffer from the poll(2) scalability problem and the fd number limit, but
this is another issue.
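
For readers who haven't seen descriptor passing, a minimal sketch follows.
It assumes the passing is done with SCM_RIGHTS ancillary data over an
AF_UNIX socket, which is one common mechanism and not necessarily the one
this server uses; send_fd and recv_fd are hypothetical helper names.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* "Main" process side: hand descriptor fd to a worker over chan. */
int send_fd(int chan, int fd)
{
    char dummy = 'F';            /* at least one byte of real data */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;

    assert(fd >= 0);
    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Worker side: receive a descriptor from chan; returns it, or -1. */
int recv_fd(int chan)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    int fd;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(chan, &msg, 0) != 1)
        return -1;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET ||
        c->cmsg_type != SCM_RIGHTS || c->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(c), sizeof fd);
    return fd;
}
```

The worker serves one request on the received fd and closes its own copy;
the connection itself stays open because the main process still holds its
descriptor.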

> Dean knows about 50 times what I do in this area, so it's probably best he
> jumps in and points out where I'm wrong.
>
> > Possibly. But according to Squid's own doc, the primary problems
> > in Squid performance are
> >
> > 1) Not enough memory
> > 2) Too slow disks
>

[skipped]

>
> Anyhow, sendfile could help here by eliminating the costly copy to/from
> userspace, which is the real limiting factor if you have really fast disks,
> really fast network and are trying to stream the data out.
>
> > CPU speed for cgi, bandwidth to the net, bandwidth on your Ethernet, etc.
> > Linux/Apache can easily saturate a 10 Mbit/s Ethernet already, how many
> > people actually have a faster connection to the net?

MS-DOS and a program that always returns the same page can saturate any
imaginable network interface, too. That doesn't mean it will work well
in a real situation, with a large number of clients expecting their
requests to be served fast.

> Intranets can and do use 100 Mbit/s, and a server can have multiple
> interfaces. I used to have a machine sitting on three T3s.
>
> > A T3 is tens of thousands of dollars a month. Compared with this,
> > a couple of Linux boxes, which give you redundancy as well are
> > nothing.
>
> More boxes - more to go wrong, more to administer, etc.
>
> And as I said, people have fast internal networks, and there are
> applications like NFS which could benefit (although to a much lesser extent
> with knfsd).
>
> > That's cool, as long as we make something that the application
> > writers can use. If we depart from Posix, we'd better get it right.
>
> POSIX doesn't specify lots of APIs we already have.

...but BSD and SysV mostly do.

> sendfile is available
> on other hardware, and if we can support it cheaply, why not? (Bloat
> arguments off the list please).

Bloat or not, the design of unixlike systems was always based on the
most generic interfaces, implemented in a form where they can be combined
easily (for example, the use of file descriptors). If a generic solution can
be efficient and will benefit more applications, it's better than a
specific solution for each of them, even if there is no actual size
increase, and it's more fair to developers of other things that need
performance improvements but can't be loud enough to influence kernel
development. If sendfile() is implemented, it should *not* have the
sole purpose of increasing the performance of one type of application,
unless its benefit over a generic solution (based on madvise(2), which
is generic and has a large number of uses that have nothing to do with
sending files) is so huge that it justifies such an ugly
interface.
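
As a rough illustration of the generic route, the sketch below maps a file,
hints sequential access with madvise(2), and writes straight from the
mapping using only existing calls. It is one possible reading of the idea,
not a worked-out design: send_mapped is a hypothetical helper, and whether
the kernel can actually avoid the copy on write() from a mapping is an
implementation question that this code does not answer.

```c
#define _DEFAULT_SOURCE          /* for madvise() on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send the file at path to out_fd through a
 * read-only mapping. Returns bytes sent, or -1 on error. */
long send_mapped(int out_fd, const char *path)
{
    struct stat st;

    assert(path != NULL);
    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;
    if (fstat(in_fd, &st) < 0 || st.st_size < 0) {
        close(in_fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;
    if (len == 0) {              /* mmap() rejects zero-length maps */
        close(in_fd);
        return 0;
    }

    char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, in_fd, 0);
    close(in_fd);                /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    /* Advisory only: tell the VM we will walk the file front to back. */
    (void)madvise(p, len, MADV_SEQUENTIAL);

    size_t off = 0;
    while (off < len) {
        ssize_t n = write(out_fd, p + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            munmap(p, len);
            return -1;
        }
        off += (size_t)n;
    }
    munmap(p, len);
    return (long)off;
}
```

Nothing here is specific to sockets or to web servers: the same hint is
available to any program that reads a file sequentially.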

Otherwise what will be next, grep(2)? find(2)? bash(2)? perl(2)?

--
Alex
