Re: disk head scheduling

Gerard Roudier (groudier@club-internet.fr)
Wed, 24 Mar 1999 21:21:11 +0100 (MET)


On Wed, 24 Mar 1999, Gadi Oxman wrote:

> On Tue, 23 Mar 1999, Gerard Roudier wrote:
>
> > > That's why we already have such a function!
> >
> > That has been the error. I mean to only have this function, since:
> >
> > A) Linux is doing IOs per block or per page from the kernel and we need
> > to coalesce and sort these small virtual IOs somewhere.
> >
> > B) The Linux design is, at the moment, to perform this work from
> > ll_rw_blk.
> >
> > The get_queue() only function looks like some band-aid for IDE to me.
> > Since a too-partial design is a bad design, I am not going to use this method.
>
> Ahem. It's fun to redesign and improve and rewrite code over and over,

I never proceeded that way, but it is what I observe everywhere nowadays,
not only under Linux: lots of ideas claimed to be far better than
the existing stuff and then ... just expected results. ;-)

> and I'm all for it just for the challenge, but we have to remember that
> the Linux block layer I/O subsystem definitely doesn't perform badly in
> practice, so in my opinion this fact alone proves that it is not a bad
> design.

I didn't write that it performs badly in practice, but thinking about SCSI
disks that are able to handle more than 32 tags and about reasonable
systems that may use several of them at the same time (I have 7 hard disks
on my system), I think that the 'one queue problem' is a real penalty.

> The get_queue() addition was an evolutionary rather than a revolutionary
> enhancement. Let me highlight its advantages:
>
> 1. Completely backward compatible. Device drivers aren't forced
> to use it - it automagically works with all the block drivers,
> and at the time we had many relevant block devices; not just
> SCSI and IDE, all the CDROM block drivers as well.

Compatibility is a requirement; doing otherwise would have been very
stupid.

> 2. It does allow us to have several queues per major number. It is
> up to the device driver to choose the number of queues which are
> most appropriate for it, and the number of devices which will
> share a queue.

Indeed, it does so.
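
To make point 2 concrete, here is a rough user-space sketch of the idea. It
is not the kernel code: the struct layout, the 16-minors-per-unit split and
every name below are made up for the illustration. A per-major hook lets the
driver pick the queue for a given device, and the single per-major queue
remains the fallback when no hook is installed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MAJOR 4
#define MAX_UNITS 8

struct request { struct request *next; };

/* Hypothetical per-major descriptor (not the real kernel structure). */
struct major_desc {
    struct request *default_queue;              /* the single shared queue */
    struct request **(*pick_queue)(int minor);  /* optional driver hook */
};

static struct major_desc majors[MAX_MAJOR];
static struct request *unit_queues[MAX_UNITS];

/* Example driver hook: one queue per unit, assuming 16 minors per unit. */
static struct request **per_unit_queue(int minor)
{
    return &unit_queues[(minor / 16) % MAX_UNITS];
}

/* Return the queue a request for (major, minor) should be put on. */
static struct request **lookup_queue(int major, int minor)
{
    struct major_desc *m;

    assert(major >= 0 && major < MAX_MAJOR);
    m = &majors[major];
    if (m->pick_queue)
        return m->pick_queue(minor);   /* driver decides the granularity */
    return &m->default_queue;          /* old behaviour: one queue per major */
}
```

A driver that never sets the hook keeps the old single-queue behaviour, which
is the backward compatibility claimed in point 1.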

> 3. The queues are shared between the low level device driver and
> the ll_rw_blk.c queueing layer. This means that while a request
> is being serviced by the low level driver, queued requests are
> *still* visible to the higher layers, and new blocks can still
> be merged into existing requests.

5 milliseconds are enough to understand that from the code.
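
For readers without the source at hand, the sharing described in point 3 can
be sketched roughly as follows. This is a toy user-space model, not the
ll_rw_blk.c code; the in_service flag and the function name are assumptions
made for the example. The request the driver is working on stays on the
list, and later blocks can still be appended to the requests queued behind
it.

```c
#include <assert.h>
#include <stddef.h>

struct request {
    unsigned long sector;      /* first sector of the request */
    unsigned long nr_sectors;  /* length in sectors */
    int in_service;            /* nonzero while the driver works on it */
    struct request *next;
};

/*
 * Try to append a new block to a request that is already queued.
 * Returns 1 if merged, 0 if the caller has to queue a new request.
 */
static int try_back_merge(struct request *queue,
                          unsigned long sector, unsigned long count)
{
    struct request *r;

    assert(count > 0);
    for (r = queue; r != NULL; r = r->next) {
        if (r->in_service)
            continue;                       /* the driver owns this one */
        if (r->sector + r->nr_sectors == sector) {
            r->nr_sectors += count;         /* contiguous: grow in place */
            return 1;
        }
    }
    return 0;
}
```

With private per-drive queues, requests already moved to the driver would no
longer be reachable by such a scan.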

> When I added the multiple queues per major to the IDE driver,
> the other approach which I have considered was to use private
> per-drive queues, not shared with ll_rw_blk, and to remove
> requests from the per-major queue into the private per-device
> queue (like the approach used in the SCSI subsystem). This will
> give us part of the advantages, but lacks the advantage of sharing
> outlined above.

You only thought about IDE, and that is what I pointed out, btw.
For SCSI with 1 queue per device, if we are globally plugged we lose
time (think of SMP, and interrupts), and if we discard the plug, we
may break the coalescing and sorting, and then just provide devices
with bunches of small IOs.
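
A toy model of that trade-off, in plain user-space C. It is only a sketch
under my own assumptions: the fixed-size array, the names and the "unplug
dispatches everything" rule are invented for the example and are not how
ll_rw_blk.c is written. While the queue is plugged, requests are held,
kept sorted by sector and coalesced; without the plug, each small IO goes
to the device as it arrives.

```c
#include <assert.h>
#include <stddef.h>

#define QMAX 16

struct io { unsigned long sector, count; };

struct queue {
    struct io pending[QMAX];
    int n;           /* number of pending entries */
    int plugged;     /* nonzero: hold requests back so they can be merged */
    int dispatched;  /* how many requests reached the device */
};

/* Hand everything pending to the device and remove the plug. */
static void unplug(struct queue *q)
{
    q->dispatched += q->n;
    q->n = 0;
    q->plugged = 0;
}

/* Queue one small IO, keeping the list sorted and coalescing neighbours. */
static void submit(struct queue *q, unsigned long sector, unsigned long count)
{
    int i, j;

    /* Find the insert position that keeps the list sorted by sector. */
    for (i = 0; i < q->n && q->pending[i].sector < sector; i++)
        ;
    if (i > 0 && q->pending[i - 1].sector + q->pending[i - 1].count == sector) {
        q->pending[i - 1].count += count;       /* back merge */
    } else if (i < q->n && sector + count == q->pending[i].sector) {
        q->pending[i].sector = sector;          /* front merge */
        q->pending[i].count += count;
    } else {
        assert(q->n < QMAX);
        for (j = q->n; j > i; j--)              /* make room */
            q->pending[j] = q->pending[j - 1];
        q->pending[i].sector = sector;
        q->pending[i].count = count;
        q->n++;
    }
    if (!q->plugged)
        unplug(q);                              /* no plug: send at once */
}
```

Four contiguous 8-sector IOs submitted to a plugged queue end up as one
32-sector request; submitted to an unplugged queue, the device sees four
separate requests. The sketch does not fuse two queued requests that a new
block happens to bridge.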

> So we had an evolutionary change which didn't affect existing drivers,
> was integrated nicely with the current design, and was used to gain
> impressive performance advantages. It wasn't designed to be revolutionary,
> and doesn't prevent us from rewriting the Linux block layer anytime we
> need a new challenge to work on. Was the addition of it to the kernel a
> bad design or band-aid for IDE?

The IDE protocol does not allow disconnections and tagged queuing. It only
allows 2 drives per controller and very short cables. The only way to
really enhance IDE is to go to a SCSI-like protocol (or to switch to
SCSI). Any other effort to enhance this thing can only be a kind of
band-aid, in my opinion, whether it is clean, clever or just stupid.

Regards,
Gérard.
