Re: [PATCH v2] clocksource: Replace loop within clocks_calc_mult_shift() with clz-based calculation for sftacc

From: John Stultz
Date: Wed Jun 11 2025 - 00:32:14 EST


On Tue, Jun 10, 2025 at 8:30 PM I Hsin Cheng <richard120310@xxxxxxxxx> wrote:
>
> v1 -> v2:
> - Refine commit message to explain more about "why"
> - Check the frequency of "clocks_calc_mult_shift()" get called,
> it's not in hotpath on my machine, refine the commit message
> to avoid overselling it
> - Add comments for the code to explain the implementation in
> more detail
> - Handle case for "tmp == 0" to avoid undefined behavior
> - Experiment using ffs() but it's used for finding the LSB of
> a number, which isn't MSB as we want. It would still need
> looping when intended to use ffs() in this scenario

Oh, apologies for mixing that up and leading you astray!


> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index 2a7802ec480c..59145d762f1e 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -66,10 +66,20 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 maxsec)
> * range:
> */
> tmp = ((u64)maxsec * from) >> 32;
> - while (tmp) {
> - tmp >>=1;
> - sftacc--;
> - }
> +
> + /*
> + * Decrement "sftacc" by the number of bits needed to represent "tmp".
> + * Using (32 - __builtin_clz(tmp)) to ge the bit width:
> + * - __builtin_clz(tmp) returns the count of leading zero bits
> + * - (32 - __builtin_clz(tmp)) gives us the number of significant bits
> + *
> + * Note: It use 32 instead of 31 because we want bit width (0-index MSB
> + * position + 1), not just the MSB position itself.
> + *
> + * Handle tmp == 0 separately since __builtin_clz(0) has undefined behavior.
> + */
> + if (tmp)
> + sftacc -= (32 - __builtin_clz(tmp));

Still think the detail that __builtin_clz only works on the bottom
32bits is good to highlight in the comment.
Though maybe, would explicitly casting tmp to a u32 help it be more clear here?

thanks
-john