[OFF-TOPIC] Re: Absolutely horrid IDE performance...

Paul Jakma (paul@clubi.ie)
Sat, 5 Dec 1998 20:11:54 +0000 (GMT)


On Sat, 5 Dec 1998, Mark Lord wrote:

> Gerard Roudier wrote:
>
> > And if your pair of drives together is really capable of 32 MB/s
> > sustained data rate, why is the block read result so much less than
> > this number (23 MB/s)?
>
> Because the drives can perform writes asynchronously to the CPU,
> with internal write-gathering.

But bonnie does a per-character read performance test before the block
read test, so the on-drive write cache should be well and truly flushed
by then (and even if it weren't, the cache is tiny compared to the file
size of a typical bonnie run, so it couldn't account for the difference).

The way I've heard it, the 33 MB/s maximum of UATA refers *literally* to
the data transfer rate, i.e. only the bursting, pure data-transfer part
of a full UATA read bus cycle. The actual maximum sustained transfer
rate of UATA is (I've heard) ~24 MB/s.

(It's not really a con: SCSI transfer rates are quoted the same way,
because all bus arbitration on SCSI is done asynchronously.)

> Remember, these drives are exactly
> the same mechanisms as on the "SCSI" versions of the same models,
> and are thus capable of exactly the same internal performance.
>
> All that is different is the external connector and protocol.
>
> They might even be faster if we implemented tagged-queuing
> for IDE (new drives now support this for ATA as well as SCSI).
>

Are you sure of this? As I understand it, the main speed-up of Ultra ATA
over EIDE is that data transfer is done in a "burst" mode, with data
transferred on the falling edge of the strobe as well as on the rising
edge. I haven't come across any references to command queueing in ATA,
and I don't believe ATA even has the concept of multiple outstanding
commands.
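
(The double-edged strobe is also where the headline figure comes from,
if my understanding of Ultra DMA mode 2 is right: a 16-bit bus clocked
on both edges, i.e.

	8.33 MHz strobe x 2 edges x 2 bytes/word ~= 33.3 MB/s

which is the quoted burst rate, not a sustained one.)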

Command queueing also implies other necessary supporting features, like
disconnection and reselection, so you would end up with something very
close to SCSI, which seems unlikely.

I agree with you that some cross-interface drives (e.g. the Quantum
Fireball) are pretty much the same in both IDE and SCSI versions in
terms of mechanics, but in terms of silicon IMO they're completely
different. Have a look at the PCBs on the two drives: the IDE one is
usually tiny, or else not very densely packed, while the SCSI drive
nearly always has a PCB covering the entire bottom of the drive. This
suggests it's quite possible that the caching and internal
access-ordering optimisation logic differs between the SCSI and IDE
versions as well.

>
> > 3) CPU load is nicely low for the system used for this benchmark.
>
> Hard to tell about that one, since all we have are percentage numbers.
> To measure CPU load, one needs measures of I/O related execution time,
> not percentages.
>
> To me, a low CPU percentage means that the I/O subsystem is slow enough
> that the CPU spends most of its time waiting for data. Not good.
>

Yep, CPU percentage can be fairly bogus. How do you tell whether a given
CPU percentage means the process is blocked in the kernel waiting for
I/O to complete (e.g. a slow drive on a slow interface), or that the
kernel is genuinely busy doing work (e.g. a fast drive on a fast,
efficient interface)?

You need to go far more in-depth than that, e.g. /proc/io_trace on newer
kernels. There's an associated program to select exactly what you want
to trace and to make the output human-readable (which I've lost :).
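
Failing that, even watching vmstat alongside a run tells you more than
bonnie's single percentage does. Something like this (a rough sketch;
bonnie's -s flag and the interpretation assume current kernels, where
time a process spends blocked on I/O is accounted as idle):

	vmstat 1 > vmstat.log &
	bonnie -s 256
	kill %1
	# high "sy" alongside high "bi"/"bo": kernel busy moving data
	# high "id" alongside high "bi"/"bo": mostly blocked on the drive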

> If we had an infinitely fast drive, then CPU percentage would always
> be around 100% -- no waiting. So the measurement is not useful on an
> absolute scale, though could have meaning when comparing systems with
> identical motherboard/cpu/memory to one another.
>

Also, CPU percentage can differ depending on how the rest of the system
is loaded. If the current process has to block in the kernel waiting on
pending I/O, and there are absolutely no other runnable processes, then
AFAIK that process will show ~100% CPU time; if there are plenty of
other processes that can run, the blocked process will show a far lower
CPU percentage.

Try some bonnie runs in single-user mode and then on a fully-up system
running a couple of CPU-bound processes (e.g. rc5des). Do it for the
following cases (see the sketch below):

single-disk IDE / SCSI
multiple-disk IDE / SCSI (run bonnies on each disk simultaneously)

and see what you get.
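
For the multiple-disk, loaded case, something like this (a sketch only;
the mount points and bonnie's -d/-s/-m flags are my assumptions about
your setup, and the file size should be well above RAM):

	rc5des &			# keep a CPU hog running

	bonnie -d /mnt/ide0  -s 256 -m ide  &
	bonnie -d /mnt/scsi0 -s 256 -m scsi &
	wait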

regards,

-- 
Paul Jakma	paul@clubi.ie
**********************************************************
/etc/crontab:
01 5 * * * root find / -name windows -type d \ 
	-fstype msdos -o -fstype vfat -exec rm -rf {} \;
**********************************************************
PGP5 public key: http://www.clubi.ie/jakma/publickey.txt
