Re: [PATCH v4 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions

From: Vincent MAILHOL
Date: Mon May 23 2022 - 05:23:18 EST


On Thu. 12 May 2022 at 10:20, Vincent Mailhol
<mailhol.vincent@xxxxxxxxxx> wrote:
> The compilers provide some builtin expression equivalent to the ffs(),
> __ffs() and ffz() function of the kernel. The kernel uses optimized
> assembly which produces better code than the builtin
> functions. However, such assembly code can not be optimized when used
> on constant expression.
>
> This series relies on __builtin_constant_p to select the optimal solution:
>
> * use kernel assembly for non constant expressions
>
> * use compiler's __builtin function for constant expressions.
>
> I also think that the fls() and fls64() can be optimized in a similar
> way, using __builtin_ctz() and __builtin_ctzll() but it is a bit less
> trivial so I want to focus on this series first. If it get accepted, I
> will then work on those two additionnal function.
>
>
> ** Statistics **
>
> Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
> of __ffs() and ffz() calls (details of the calculation in each patch).
>
>
> ** Changelog **
>
> v3 -> v4:
>
> * (no changes on code, only commit comment was modified)
>
> * Remove note and link to Nick's message in patch 1/2, c.f.:
> https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@xxxxxxxxxxxxxx/
>
> * Add Reviewed-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx> in tag in patch 2/2.
>
>
> v2 -> v3:
>
> * Redacted out the instructions after ret and before next function
> in the assembly output.
>
> * Added a note and a link to Nick's message on the constant
> propagation missed-optimization in clang:
> https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@xxxxxxxxxxxxxx/
>
> * Fix copy/paste typo in statistics of patch 1/2. Number of
> occurences before patches are 1081 and not 3607 (percentage
> reduction of 26.7% remains correct)
>
> * Rename the functions as follow:
> - __varible_ffs() -> variable___ffs()
> - __variable_ffz() -> variable_ffz()
>
> * Add Reviewed-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx> in tag in patch 1/2.
>
> Vincent Mailhol (2):
> x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant
> expressions
> x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant
> expressions

Hi Thomas, Ingo and Borislav,

Are there any chances for you to pick those two patches during this
week's merge windows?
https://lore.kernel.org/all/20220512011855.1189653-2-mailhol.vincent@xxxxxxxxxx/
https://lore.kernel.org/all/20220512011855.1189653-3-mailhol.vincent@xxxxxxxxxx/

Thank you!


Yours sincerely,
Vincent Mailhol