Re: More on 2.2.18pre2aa2 (summary of elevator ideas)

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Tue Sep 12 2000 - 18:13:50 EST


One important point on remirroring I did not mention in my post. In
NetWare, remirroring scans the disk BACKWARDS (n....0) to prevent
artificial starvation while remirring is going on. This was another
optimization we learned the hard way by trying numerous approaches to
the problem.

Jeff

Ed Tomlinson wrote:
>
> Hi,
>
> Geez, A simple comment on IRC can _really_ generate lots of feedback.
> (There were over 50 messages about this in my queue - did not help
> that some were duplicated three times <grin>).
>
> I made the comment because I remember back when the discussion was current
> on linux kernel. I thought Jeff Merkey's, message was to the point. Para-
> phrasing from memory, it was something to the effect that novell had
> tried many elevators. All had problems with some loads. The best they
> had found was the 'close the door' idea. I do not remember if the door
> was based on requests or time. Another point to remember is that the
> netware people came up with a what they considered a good solution. From
> Jeff's comment they arrived at this solution by experiments and bitter
> experience. Maybe we can learn something from their research?
>
> Here is what I glean from the thread:
>
> >From all the discussion I find this suggestion from Alan to make lots of
> sense. Think it can be made to work with number of request almost as easily
> as with time...
>
> >When you do the scan to decide where to insert the entry you dont consider
> >insertion before the time. Also you keep two queue heads the real and the
> >insert head. Whenever you walk from the insert head and find it points to
> >a 'too old' entry you update the insert_head.
>
> And this suggestion from Rik should counter most of Andrea's time vs requests
> vs slow block devices issues. We just have to be sure to close the door after
> atleast n request or m time. However, as pointed out by Chris Evan, later we
> may not have to do this - there are stats that can give a good idea of a
> device's latency.
>
> >Not really. What about just using something like
> >"half a second, but at least 10 requests liberty to
> >reorder" ?
>
> >It's simple, should be good enough for the user and
> >allows for a little bit of reordering even on very
> >slow devices ...
>
> As Andrea points out its easy enought to do some sort of test with
> the current code.
>
> >Well changing that is very easy, you only have to change the unit of
> >measure w/o changing one bit in the algorithm that is just implemented
> >indeed.
>
> >How
>
> >Just assume the req->elevator_sequence to be calculate in jiffy and in
> >elevator.c change the check for `!req->elevator_sequence' to
> >`time_before(req->elevator_sequence, jiffies)'. Then of course change the
> >initialization of req->elevator_sequence to be done with `jiffies +
> >elevator->read_latency'. Then also elvtune will talk in jiffies and not in
> >requests.
>
> I wonder if using a wandering insert pointer, as Alan suggests, would give
> lower overhead than the current implementation (and would it really help)?
>
> Again from Alan,
>
> Andrea> Now, I see people trying to introduce the concept of elapsed time
> Andrea> that fix, which smells strongly of hack. How will this hack be cobbled
> >
> > Actually my brain says that elapsed time based scheduling is the right
> > thing to do. It certainly works for networks
>
> And from Chris Evans,
>
> >Interesting, I'll try and run with this. The mention of networks reminds
> >me that any "max service time" variable is a tunable quantity depending on
> >current conditions..
>
> >.. and sct's block device I/O accounting patches give us the current
> >average request service time on a per-device basis. Multiply that up a bit
> >and maybe you have your threshold for moving things to the head of the
> >queue.
>
> So we could end up using a figure that builds in both number of request and
> time on a per device level without to much effort...
>
> And Alan sums up the whole thing nicly with:
>
> Andrea> Why do you say it's not been fixed? Can you still reproduce hangs long
> Andrea> a write(2) can write? I certainly can't.
>
> >I cant reproduce long hangs. Im not seeing as good I/O throughput as before
> >but right now Im quite happy with the tradeoff. If someone can make it better
> >then Im happier still
>
> If the idea works, lead to simpler code, a more reponsive system maybe with
> better benchmarks then its a winner. Only way we can be sure is to try it.
>
> Thanks,
>
> Ed Tomlinson <tomlins@cam.org> (ontadata on IRC)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Sep 15 2000 - 21:00:19 EST