Re: The Kommunity vs. Dick Johnson

Richard B. Johnson (root@chaos.analogic.com)
Wed, 18 Nov 1998 13:16:40 -0500 (EST)


On Tue, 17 Nov 1998 yodaiken@chelm.cs.nmt.edu wrote:

> >
> > Date: Mon, 16 Nov 1998 08:17:25 -0500 (EST)
> > From: Isaac Connor <iconnor@penultima.ml.org>
> >
> > I am aware of many instances where gcc was able to produce better code
> > than some asm-advocates I know. It also seems to me, that you have to
> > look at register use, and how that affects code before and after the asm
> > function.
> >
> > This of course proves nothing. The real acid test is whether or not GCC
> > can produce better code than what the *best* asm-advocates can
> > produce. For example, I've yet to see a version of gcc which can do a
> > good job of compiling the MD5 crypto checksum. The problem is that you
> > have to be really clever to keep all of the MD5 accumulators in
> > registers, and every gcc I've played with fails to do this, and ends up
> > placing one or more of the MD5 state variables on the stack.
> > Hence, in general gcc doesn't seem to handle algorithms which put
> > pressure on the i386's absurdly small register file.
>
> Does that have a measurable cost on a PII? The likelihood that the register
> will be in a shadow register, a write buffer, or in cache seems close to
> 100% if it is used soon again. I'm curious about whether the hardware is
> making register allocation less critical.
>

In fact, the faster the CPU core, the less critical code generation
becomes. If you are dealing with large amounts of data, you will
ultimately be bound by RAM access (where the data are), so a fast CPU
has a lot of cycles that would otherwise be wasted.

This fact is being used to get a checksum essentially for free in
the Linux kernel. When copying data from one memory location to another,
if you choose the instruction order properly, you can do some summation
during the time that the CPU would otherwise be doing internal NOPs.
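
To make the copy-and-sum idea concrete, here is a minimal sketch in portable
C. It is not the kernel's routine (that is hand-scheduled assembly); the name
copy_and_sum and the choice of a big-endian 16-bit ones' complement sum are
assumptions made for illustration. The point is only that each word is
loaded once, stored to the destination, and added to the running sum while
it is still in a register.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): copy len bytes from src
 * to dst and return the folded 16-bit ones' complement sum of the data,
 * treating each byte pair as a big-endian word.
 */
uint16_t copy_and_sum(void *dst, const void *src, size_t len)
{
    const unsigned char *s = src;
    unsigned char *d = dst;
    uint64_t sum = 0;   /* wide accumulator, so carries are not lost */

    while (len >= 2) {
        unsigned char hi = s[0], lo = s[1];   /* load once ...          */
        d[0] = hi;                            /* ... store the copy ... */
        d[1] = lo;
        sum += ((uint32_t)hi << 8) | lo;      /* ... and sum in passing */
        s += 2;
        d += 2;
        len -= 2;
    }
    if (len) {                                /* trailing odd byte */
        *d = *s;
        sum += (uint32_t)*s << 8;             /* padded with a zero low byte */
    }
    while (sum >> 16)                         /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

A protocol checksum would normally be the bitwise complement of the value
returned here. Whether the additions really cost nothing depends on the CPU
and on how the loop is scheduled, so treat this as an illustration of the
idea rather than a measurement.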

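In principle, if you had a fast enough CPU, and didn't improve the
RAM speed, i.e., 60-70ns access, you could do a lot of stuff in
BASIC and it would not impact system performance. For example, at a
400 MHz core clock (2.5ns per cycle), a single 60-70ns access costs
roughly 24 to 28 cycles.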

Cheers,
Dick Johnson
***** FILE SYSTEM WAS MODIFIED *****
Penguin : Linux version 2.1.128 on an i686 machine (400.59 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.
