Re: [PATCH v5 0/2] bitops: Optimize fns() for improved performance

From: Yury Norov
Date: Thu May 02 2024 - 10:55:55 EST


On Thu, May 02, 2024 at 05:24:41PM +0800, Kuan-Wei Chiu wrote:
> Hello,
>
> This patch series optimizes the fns() function by avoiding repeated
> calls to __ffs(). Additionally, tests for fns() have been added in
> lib/test_bitops.c.
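
For readers following along, here is a minimal userspace sketch of the idea described above. It is not the patch itself. The names ffs_index(), fns_scan_each() and fns_scan_once() are hypothetical, and the GCC/Clang builtin __builtin_ctzl() is assumed as a stand-in for the kernel's __ffs(). The first variant scans for the lowest set bit on every iteration; the second clears the n lowest set bits with word &= word - 1 and scans only once.

```c
#include <limits.h>

/* Assumption: mirrors the kernel's BITS_PER_LONG for this userspace sketch. */
#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Stand-in for __ffs(): index of the lowest set bit; word must be non-zero. */
static inline unsigned int ffs_index(unsigned long word)
{
	return (unsigned int)__builtin_ctzl(word);
}

/* Baseline: one lowest-set-bit scan per loop iteration. */
static unsigned int fns_scan_each(unsigned long word, unsigned int n)
{
	while (word) {
		unsigned int bit = ffs_index(word);

		if (n-- == 0)
			return bit;
		word &= ~(1UL << bit);	/* drop the bit just visited */
	}

	return BITS_PER_LONG;		/* fewer than n + 1 bits were set */
}

/* Sketch of the optimization: clear n low set bits, then scan once. */
static unsigned int fns_scan_once(unsigned long word, unsigned int n)
{
	while (word && n--)
		word &= word - 1;	/* clears the lowest set bit */

	return word ? ffs_index(word) : BITS_PER_LONG;
}
```

With word = 0xb (bits 0, 1 and 3 set) and n = 2, both variants return 3. With n = 3 both return BITS_PER_LONG, since there is no fourth set bit.
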

OK, now looks good. Thanks for the work, Kuan-Wei.

I'll take it in bitmap-for-next. Andrew, can you drop the previous
version from -mm?

Thanks,
Yury

>
> Changes in v5:
> - Reduce testing iterations from 1000000 to 10000 to decrease testing
>   time.
> - Move 'buf' inside the function.
> - Mark 'buf' as __initdata.
> - Assign the results of fns() to a volatile variable to prevent
>   compiler optimization.
> - Remove the iteration count from the benchmark result.
> - Update benchmark results in the commit message.
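
A rough userspace illustration of the volatile-sink point in the v5 list above, assuming GCC or Clang. This is not the code in lib/test_bitops.c. nth_set_bit(), fill_random() and bench_nth_set_bit() are hypothetical names, and clock() stands in for whatever timing the kernel test uses. The buffer is filled before the timed region so that random number generation is not measured, and each result is stored to a volatile variable so the compiler cannot discard the calls.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical stand-in for fns(): index of the n-th set bit, or WORD_BITS. */
static unsigned int nth_set_bit(unsigned long word, unsigned int n)
{
	while (word && n--)
		word &= word - 1;

	return word ? (unsigned int)__builtin_ctzl(word) : WORD_BITS;
}

/* Fill the buffer up front so the generator is outside the timed region. */
static void fill_random(unsigned long *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = ((unsigned long)rand() << 16) ^ (unsigned long)rand();
}

/* Returns elapsed CPU time in clock ticks; *calls receives the call count. */
static clock_t bench_nth_set_bit(const unsigned long *buf, size_t len,
				 unsigned long *calls)
{
	volatile unsigned int sink;	/* every store to it must be emitted */
	clock_t start = clock();

	*calls = 0;
	for (size_t i = 0; i < len; i++) {
		for (unsigned int n = 0; n < WORD_BITS; n++) {
			sink = nth_set_bit(buf[i], n);
			(*calls)++;
		}
	}
	(void)sink;

	return clock() - start;
}
```

Without the volatile qualifier, an optimizing compiler may notice that the results are never used and remove the loop body, which would make the measured time meaningless.
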
>
> Changes in v4:
> - Correct get_random_long() -> get_random_bytes() in the commit
>   message.
>
> Changes in v3:
> - Move the benchmark test for fns() to lib/test_bitops.c.
> - Exclude the overhead of random number generation from the benchmark
>   result.
> - Change the output to print only a total gross instead of each n in
>   the benchmark result.
> - Update the commit message in the second patch.
>
> Changes in v2:
> - Add benchmark test for fns() in lib/find_bit_benchmark.c.
> - Change the loop in fns() by counting down from n to 0.
> - Add find_bit benchmark result for find_nth_bit in commit message.
>
> Link to v4: https://lkml.kernel.org/20240501132047.14536-1-visitorckw@xxxxxxxxx
> Link to v3: https://lkml.kernel.org/20240501071647.10228-1-visitorckw@xxxxxxxxx
> Link to v2: https://lkml.kernel.org/20240430054912.124237-1-visitorckw@xxxxxxxxx
> Link to v1: https://lkml.kernel.org/20240426035152.956702-1-visitorckw@xxxxxxxxx
>
> Kuan-Wei Chiu (2):
>   lib/test_bitops: Add benchmark test for fns()
>   bitops: Optimize fns() for improved performance
>
>  include/linux/bitops.h | 12 +++---------
>  lib/test_bitops.c      | 22 ++++++++++++++++++++++
>  2 files changed, 25 insertions(+), 9 deletions(-)
>
> --
> 2.34.1