Re: [PATCH] powerpc/32: Remove one insn in __bswapdi2

From: Segher Boessenkool
Date: Thu Aug 11 2016 - 18:12:13 EST


On Thu, Aug 11, 2016 at 11:34:37PM +0200, Gabriel Paubert wrote:
> On the other hand gcc did at the time a very poor job (quite an
> understatement) at bswapdi when compiling for 64 bit processors
> (see the example).
>
> But what do modern compilers generate for bswapdi these days? Do they
> still call the library or not?

Nope, GCC expands it inline.

> After all, bswapdi on 32 bit processors only takes 6 instructions if the
> input and output registers don't overlap.

For this testcase:
===
typedef unsigned long long u64;
u64 bs(u64 x) { return __builtin_bswap64(x); }
===

we get with -m32:
===
bs:
	mr 9,3
	rotlwi 3,4,24
	rlwimi 3,4,8,8,15
	rlwimi 3,4,8,24,31
	rotlwi 4,9,24
	rlwimi 4,9,8,8,15
	rlwimi 4,9,8,24,31
	blr
===
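
For readers who do not have the rotate-and-insert encodings memorised, here
is a rough C rendering of what that sequence computes. It is only a sketch:
the helper names (rotl32, bswap32_rot, bswap64_rot) are made up for
illustration, and it assumes the usual big-endian 32-bit convention of the
high word in r3 and the low word in r4. Each rlwimi is modelled as "rotate
the source left by 8, then insert it under a byte mask".

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits; only valid for n in 1..31. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
	return (x << n) | (x >> (32 - n));
}

/* Byte-swap one 32-bit word the way the rotlwi/rlwimi/rlwimi triple does. */
static uint32_t bswap32_rot(uint32_t x)
{
	uint32_t r = rotl32(x, 24);		/* rotlwi d,s,24 */
	uint32_t t = rotl32(x, 8);		/* rotation used by both rlwimi */

	/* rlwimi d,s,8,8,15: IBM bits 8..15 are mask 0x00ff0000 */
	r = (r & ~0x00ff0000u) | (t & 0x00ff0000u);
	/* rlwimi d,s,8,24,31: IBM bits 24..31 are mask 0x000000ff */
	r = (r & ~0x000000ffu) | (t & 0x000000ffu);
	return r;
}

/* Swap each half, then exchange the halves. */
static uint64_t bswap64_rot(uint64_t x)
{
	uint32_t hi = (uint32_t)(x >> 32);
	uint32_t lo = (uint32_t)x;

	return ((uint64_t)bswap32_rot(lo) << 32) | bswap32_rot(hi);
}
```

The leading "mr 9,3" in the listing is only there because the input and
output register pairs overlap; without it the body is the six instructions
mentioned above.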

and with -m64:
===
.L.bs:
	srdi 10,3,32
	mr 9,3
	rotlwi 3,3,24
	rotlwi 8,10,24
	rlwimi 3,9,8,8,15
	rlwimi 8,10,8,8,15
	rlwimi 3,9,8,24,31
	rlwimi 8,10,8,24,31
	sldi 3,3,32
	or 3,3,8
	blr
===

Neither is as tight as possible, but neither is horrible.


Segher