Re: Preparations for ZD's upcoming Apache/Linux benchmark - Why not

Brian Macy (bmacy@sunshinecomputing.com)
Thu, 10 Jun 1999 08:51:35 -0700


Hmmm... I'm not much of an http guru but as a developer this idea just
smells like the best one I've heard just from a design standpoint.
Having the cache be userland controlled certainly adds to the
flexibility and simplicity of the kernel side of things.

Brian Macy

Dean Gaudet wrote:
>
> On Wed, 9 Jun 1999, Ricardo Galli Granada wrote:
>
> > Yes, the proposal should go to web server developers, not to linux-kernel.
>
> It's funny, almost a year ago, to the day, I wrote up something I called
> an "HTTP flow cache". It's similar in concept to cisco flow caching, or
> to the route cache in 2.2.x, or to the dcache.
>
> Essentially, you start with an empty cache. Cache misses fail to a
> complete HTTP implementation (i.e. apache). The HTTP implementation
> populates the cache with "pattern -> (header, body)" mappings. The
> patterns are chosen so that a match won't violate HTTP/1.1 semantics.
>
> A pattern is something like:
>
> the request-url matches /this/url
>
> and the "Host:" header is in the set {
> "foobar.com"
> }
>
> and the "Accept:" header is in the set {
> "image/jpeg",
> "image/jpeg, image/gif",
> "*/*"
> }
>
> then serve /this/file with these <fill-in-the-blank> headers
>
> For example, a full content-negotiation algorithm is a pain in the
> ass... but after you've seen the Accept* headers from a browser, and
> served a response, you know how to negotiate requests from other browsers
> with the same Accept* headers. So in the example above you would
> continue to add elements to the set as you discovered more browsers.
>
> The next trick is the simplification that eliminates the need to have a
> copy of all those "accept sets" and such for every URL. This involves a
> simple observation about how webservers are configured -- for apache
> in particular... there's a hierarchy formed by the <Directory> et
> al containers. And any two resources which shares the same set of
> <Directory> matches can share the same patterns in the flow cache.
> (Not 100% true, but you're probably not all http geeks, so it's pointless
> to go into full detail :)
>
> At any rate, these patterns can be used to handle pretty much all
> standard HTTP usage. They essentially say "if a request matches one
> of the patterns, then we can serve it quickly without going through the
> full processing; otherwise we bail and do full processing".
>
> This is a fine and dandy theoretical concept... but is it useful in
> practice?
>
> I don't know.
>
> But what I do know is that you shouldn't be putting policy into the
> kernel... such as mappings between extensions and content-types. So my
> suggestion is that the kernel behave like the flow cache, in that it
> bails to userland on any request it has never seen before.
>
> Also, I didn't look to see if kHTTPd supports this already ... but the
> Host: header is essentially part of the URL, and ignoring it is
> problematic. HTTP sucks, repeat it again, HTTP sucks.
>
> I stopped work in this direction because I came to the realisation that
> nobody needs this. It's a benchmark gimick at best. As I said already,
> if you're building a big website, you either scale it horizontally or
> you'll be fired the first time your massive zillion cpu server fails and
> your service goes down.
>
> I should stop talking and just finish rearchitecting apache so you all can
> complain about something else ;)
>
> Dean
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/