Re: [PATCH] Make div64_u64() precise on 32bit platforms
From: Brian Behlendorf
Date: Tue Oct 12 2010 - 15:36:55 EST
I'm resending the patch as is and adding what I hope are the right CCs. Also
let me explain why I opted to add abs64() and use the gcc builtin.
>Can't we just improve abs? Say,
I was reluctant to change abs() since it would have a much larger impact on
the code base. Using typeof() should be OK but if any of the callers
mistakenly call abs() with an unsigned value then we could see compiler
warnings about '__x < 0' being a useless conditional.
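
To make the concern concrete, here is a rough sketch of what a typeof()-based
macro could look like. It is not the kernel's abs() and not the abs64() from
the patch; the names generic_abs, abs_s64_example and abs_u64_example are
made up for illustration, and the macro relies on GNU C extensions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical typeof()-based abs, for illustration only.
 * Uses GNU C statement expressions and __typeof__.
 */
#define generic_abs(x) ({			\
	__typeof__(x) __x = (x);		\
	(__x < 0) ? -__x : __x;			\
})

/* A signed 64-bit caller gets the expected result. */
static inline int64_t abs_s64_example(int64_t v)
{
	return generic_abs(v);
}

/*
 * An unsigned caller still compiles, but '__x < 0' can never be true.
 * gcc may warn about the comparison (for example with -Wtype-limits),
 * and the value comes back unchanged.
 */
static inline uint64_t abs_u64_example(uint64_t v)
{
	return generic_abs(v);
}
```

With a macro like this, every existing caller that passes an unsigned value
becomes a potential new warning, which is the wider impact described above.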
>This is a bit unusual. I mean, it is not that common to use gcc builtins
>in the normal code. And, it seems, we can use __fls(divisor >> 32) or
>just fls64() instead ?
I opted for the gcc builtin because I felt it made the code more readable. I
also suspect it will perform slightly better than __fls() on some archs. For
example, on powerpc __fls() is implemented using the 'cntlzw' instruction.
It returns (BITS_PER_LONG - 1 - cntlzw) which is wasted work since my
function would immediately undo this to get back cntlzw. If I were lucky the
compiler would optimize this away for me, but if I use the builtin I don't
need to take the chance.
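
The round trip can be sketched as follows. This is only an illustration under
assumptions: fls_via_clz_example stands in for a cntlzw-style __fls() and is
not the powerpc implementation, BITS_PER_LONG_EX and the leading_zeros_*
helpers are hypothetical names, and none of this is the function from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for BITS_PER_LONG. */
#define BITS_PER_LONG_EX ((int)(8 * sizeof(unsigned long)))

/*
 * Stand-in for an __fls() built on a count-leading-zeros instruction:
 * returns the index of the most significant set bit.
 * Undefined for word == 0, like the builtin itself.
 */
static inline int fls_via_clz_example(unsigned long word)
{
	return BITS_PER_LONG_EX - 1 - __builtin_clzl(word);
}

/* Route 1: recover the leading-zero count by undoing the subtraction. */
static inline int leading_zeros_via_fls(unsigned long word)
{
	return BITS_PER_LONG_EX - 1 - fls_via_clz_example(word);
}

/* Route 2: ask the builtin for the leading-zero count directly. */
static inline int leading_zeros_via_builtin(unsigned long word)
{
	return __builtin_clzl(word);
}
```

Both routes return the same value for any non-zero word. The first only gets
there by subtracting twice, and it depends on the compiler to fold that away.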
--
Thanks,
Brian Behlendorf