Re: Recent change in tcp_output.c is surely wrong

From: David S. Miller (davem@redhat.com)
Date: Mon Jan 17 2000 - 16:51:35 EST


   Date: Mon, 17 Jan 2000 16:34:54 -0500
   From: Matthew Wilcox <willy@thepuffingroup.com>

   On Mon, Jan 17, 2000 at 01:18:45PM -0800, David S. Miller wrote:
> > If the intention is to clear bit 31, `&= 0x7fffffff' is the thing which
> > works and is probably more efficient.
>
> Not true on all RISC machines I am familiar with. It's 2 instructions
> either way. On x86 you'll end up using a larger opcode and one of
> x86's most notable performance advantages is it's code density.

   Really? On ARM and PA-RISC, it's 1 instruction (BIC and DEPI,
   respectively). Do SPARC, MIPS and Alpha really not have a `clear bit'
   instruction?

They do, via "andn" on Sparc for example, which takes an immediate
second argument mask which is negated before and'ing it with the first
argument, and this second argument is limited to a signed 13 bit
immediate value. Examples:

So on Sparc there would be two options:

1) sethi %hi(0x80000000), %reg1
        andn %ato, %reg1, %result

2) sll %ato, 1, %reg1
        srl %reg1, 1, %result

On Alpha, loading the high portion of a 32-bit immediate value
sign-extends it, unlike Sparc's 'sethi' above (which zero extends it
on 64-bit Sparc). This means that the shift/shift sequence is the
fastest method. Richard might correct me on this.

If my memory serves, there is no 'andn' instruction on MIPS
so the shift/shift is likely to be the optimal choice here
as well.

Finally, as a result of this, and after some testing to double
check, GCC will not "see" the shift/shift variant as an equivalent
operation when you code it as "&= 0x7fffffff" or "&= ~(1<<31)".
GCC should be able to see this and determine if it is a cheaper
way on a platform to compute the expression, but right now it
does now.

Also, using a temporary immediate value (as in example #1 for Sparc
above) uses one more register than the shift/shift varient, which
could unduly increase register pressure for the procedure. I know
this because I faced this issue in some of the mm/memory.c code
on Sparc64, and I was able to kill several register spills by using
the shift/shift code where applicable in the pgtable.h macros instead
of and-mask.

Later,
David S. Miller
davem@redhat.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jan 23 2000 - 21:00:16 EST