Re: .../asm-i386/bitops.h performance improvements

From: Richard B. Johnson
Date: Wed Jun 15 2005 - 08:08:01 EST



LEA was designed for address calculation on ix86 processors.
If it is used to ready the value of an index register for the
next memory access, it can run in parallel with the next operations.
However, if it is just used to put a value into a register, where
the CPU can't proceed until that value is finalized, it does
nothing more useful than shifts and adds.

In other words, don't substitute LEA for INC or ADD just because
you can.

leal 0x04(%ebx), %ebx
... and
addl $0x04, %ebx

... are functionally the same if the CPU needs the value in ebx
immediately (apart from ADD updating the flags, which LEA leaves
untouched). In the code sequence...

movl (%ebx), %eax
leal 0x04(%ebx), %ebx # Next address
xorl %ecx, %eax
movl %eax, (%ebx)

... the address calculation for the marked next address can proceed
in parallel with the xorl operation that follows. This makes LEA
helpful. However, in the following...

leal (%%eax,%%edi,8),%%eax

... the CPU needs to complete the whole operation before proceeding.
If you measure this form of LEA, with a base register plus a scaled
index register, you will find that the shift and add is faster,
guaranteed.
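
For reference, here is a rough sketch in plain C of what the two
forms compute. This is not the kernel's code, and the function names
are made up for illustration. It only shows that both produce
base + index*8; it says nothing about which is faster on a given CPU.

```c
#include <stdint.h>

/* Hypothetical helper: mirrors "shll $3,%edi ; addl %edi,%eax". */
uint32_t shift_then_add(uint32_t base, uint32_t index)
{
    index <<= 3;          /* shll $3: multiply the index by 8 */
    return base + index;  /* addl: add the scaled index to the base */
}

/* Hypothetical helper: mirrors "leal (%eax,%edi,8),%eax". */
uint32_t lea_style(uint32_t base, uint32_t index)
{
    return base + index * 8u;  /* base + index*8 in one expression */
}
```

Both return the same value for any inputs (the arithmetic wraps
modulo 2^32). One difference the C version hides: the shll/addl pair
overwrites edi with the scaled index, while leal leaves edi alone.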

On Wed, 15 Jun 2005, Gene Heskett wrote:

On Wednesday 15 June 2005 04:53, cutaway@xxxxxxxxxxxxx wrote:
In find_first_bit() there exists this sequence:

shll $3,%%edi
addl %%edi,%%eax

LEA knows how to multiply by small powers of 2 and add all in one
shot very efficiently:

leal (%%eax,%%edi,8),%%eax


In find_first_zero_bit() the sequence:

shll $3,%%edi
addl %%edi,%%edx

could similarly become:

leal (%%edx,%%edi,8),%%edx

To what CPU families does this apply? E.g., this may be true for Intel,
but what about AMD, VIA, etc.?


-
To unsubscribe from this list: send the line "unsubscribe
linux-kernel" in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.35% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.


Cheers,
Dick Johnson
Penguin : Linux version 2.6.11.9 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.