Re: [PATCH] Remove some divide instructions

From: Linus Torvalds
Date: Thu Oct 28 2004 - 23:54:35 EST




On Thu, 28 Oct 2004, Zachary Amsden wrote:
>
> This leaves several options:
>
> 1) Forget the optimization altogether
> 2) Go back to the (base == 1) check

Ok, I think I led you on a merry goose-chase, and the "base == 1" check
was the only one worth bothering with after all. Sorry about that.

> This seems like a lot of work for a trivial optimization; for i386,
> perhaps #2 is the most appropriate - with a sufficiently new GCC, this
> optimization should be automatic for all architectures not hardcoding
> do_div as inline assembler.

The do_div() optimization is a trivial one, and one that gcc should
definitely have recognized. It's not like a compiler cannot see that you
have a 64 / 32 divide, and realize that it's cheaper than a full 64 / 64
divide.

But hey, even if the gcc people finally did that optimization today, it
would take a few years before we didn't support old compilers any more,
so.

> Seems to have come full circle - the trivial extension turns out to have
> non-trivial side effects. If only GCC were as easily extensible as
> sparse! A __builtin_highest_one_bit() function would make it possible
> to use inline assembler without degenerating to individual cases for
> each bit.

Yes. There are tons of places where we'd love to have a constant
compile-time "log2()" function. And yes, I could do it in sparse in about
ten more lines of code, but..

I guess you could do it with a lot of tests... Something like this should
do it

#define __constant_log2(y) ((y==1) ? 0 : \
(y==2) ? 1 : \
(y==4) ? 2 : \
(y==8) ? 3 : -1) /* We could go on ... */

#define do_div(x,y) ({ \
unsigned long __mod; \
int __log2; \
if (__builtin_constant_p(y) && \
!((y) & ((y)-1)) && \
(__log2 = __constant_log2((y))) >= 0) { \
mod = x & ((y)-1); \
(x) >>= __log2; \
} else { \
.. inline asm case .. \
} \
__mod; })

which looks like it should work, but it's getting so ugly that I suspect I
should be committed for even thinking about it.

(And no, I didn't test the above. It is all trivially optimizable by a
compiler, and I can't see how gcc could _fail_ to get it right, but hey, I
thought the previous thing would work too, so I'm clearly not competent to
make that judgement... ;)

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/