Re: [RFC][PATCH] Faster generic_fls

From: Linus Torvalds (torvalds@transmeta.com)
Date: Wed Apr 30 2003 - 09:11:47 EST


On 30 Apr 2003, Falk Hueffner wrote:
>
> gcc 3.4 will have a __builtin_ctz function which can be used for this.
> It will emit special instructions on CPUs that support it (i386, Alpha
> EV67), and use a lookup table on others, which is very boring, but
> also faster.

Classic mistake. Lookup tables are only faster in benchmarks; they are
almost always slower in real life. You only need to miss in the cache
_once_ on the lookup to lose all the time you won on the previous
hundred calls.
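
To make the trade-off concrete, here is a rough sketch of the two styles
for a 32-bit "find last set". It is not the code from the patch under
discussion; the names fls_shift, fls_lookup, fls_table, fls_table_init
and fls_selfcheck are made up for illustration. The shift version touches
no data memory at all, while the table version depends on a 256-byte
table (four lines, assuming 64-byte cache lines) being hot in the cache.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-test version: a binary search over the word.
 * Returns the 1-based index of the highest set bit, or 0 if x == 0.
 * No loads from data memory, so no data-cache misses. */
int fls_shift(uint32_t x)
{
    int r = 32;

    if (!x)
        return 0;
    if (!(x & 0xffff0000u)) { x <<= 16; r -= 16; }
    if (!(x & 0xff000000u)) { x <<= 8;  r -= 8;  }
    if (!(x & 0xf0000000u)) { x <<= 4;  r -= 4;  }
    if (!(x & 0xc0000000u)) { x <<= 2;  r -= 2;  }
    if (!(x & 0x80000000u)) { r -= 1; }
    return r;
}

/* Hypothetical per-byte table: fls_table[b] is the fls of byte value b. */
static uint8_t fls_table[256];

/* Fill the table once before using fls_lookup(). */
void fls_table_init(void)
{
    int i;

    fls_table[0] = 0;
    for (i = 1; i < 256; i++)
        fls_table[i] = (uint8_t)(1 + fls_table[i >> 1]);
}

/* Table version: pick the highest non-zero byte, then do one load.
 * That single load is where a cache miss would hurt. */
int fls_lookup(uint32_t x)
{
    if (x & 0xffff0000u) {
        if (x & 0xff000000u)
            return 24 + fls_table[x >> 24];
        return 16 + fls_table[x >> 16];
    }
    if (x & 0x0000ff00u)
        return 8 + fls_table[x >> 8];
    return fls_table[x];
}

/* Sanity check: both versions must agree on a few sample values. */
void fls_selfcheck(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 2u, 0xffu, 0x100u, 0x12345u, 0x7fffffffu, 0xffffffffu
    };
    unsigned i;

    fls_table_init();
    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(fls_shift(samples[i]) == fls_lookup(samples[i]));
}
```

A tight loop calling fls_lookup() keeps the table resident and will look
good in a microbenchmark. The case described above is the other one: the
call is made occasionally, in between code that has evicted the table.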

"Small and simple" is almost always better than the alternatives. I
suspect that's one reason why older versions of gcc often generate code
that actually runs faster than newer versions: the newer versions _look_
like they do a better job, but..

                        Linus
