Re: [PATCH v4 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

From: Adrian Ratiu
Date: Wed Jan 20 2021 - 09:15:07 EST


On Tue, 19 Jan 2021, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
On Tue, Jan 19, 2021 at 5:17 AM Adrian Ratiu <adrian.ratiu@xxxxxxxxxxxxx> wrote:

From: Nathan Chancellor <natechancellor@xxxxxxxxx>
Drop warning because kernel now requires GCC >= v4.9 after commit 6ec4476ac825 ("Raise gcc version requirement to 4.9") and clarify that -ftree-vectorize now always needs enabling for GCC by directly testing the presence of CONFIG_CC_IS_GCC.
Another reason to remove the warning is that Clang exposes itself as GCC < 4.6, so it triggers the GCC warning, which makes little sense and misleads Clang users by telling them to update GCC.
Because Clang is now supported by the kernel, print a clear Clang-specific warning.
Link: https://github.com/ClangBuiltLinux/linux/issues/496
Link: https://github.com/ClangBuiltLinux/linux/issues/503
Reported-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
Reviewed-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>

This is not the version of the patch I had reviewed; please drop my reviewed-by tag when you change a patch significantly, as otherwise it looks like I approved this patch.
Nacked-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>

Sorry for not removing the reviewed-by tags from the previous versions in this v4. I guess the only way forward with this is to actually make clang vectorization work. Also thanks for the patch suggestion in the other e-mail!
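
For reference, a minimal sketch of one way "making clang vectorization work" could be approached, assuming a per-loop hint is acceptable. The helper xor_words() is hypothetical and not taken from the kernel tree. The only Clang-specific piece is the documented "#pragma clang loop vectorize(enable)" hint, which asks the loop vectorizer to vectorize that loop even where its cost model would decline. Whether this yields NEON code for the real xor routines has not been checked here.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: XOR n words of src into dst.
 * Under Clang, the loop pragma requests vectorization of this one loop;
 * other compilers never see the pragma because of the #ifdef.
 */
static void xor_words(unsigned long *dst, const unsigned long *src, size_t n)
{
	size_t i;

#ifdef __clang__
#pragma clang loop vectorize(enable)
#endif
	for (i = 0; i < n; i++)
		dst[i] ^= src[i];
}
```

The hint is scoped to a single loop, so it would have to be repeated for each loop of interest, unlike the file-wide "#pragma GCC optimize" used for GCC.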

Signed-off-by: Nathan Chancellor <natechancellor@xxxxxxxxx>
Signed-off-by: Adrian Ratiu <adrian.ratiu@xxxxxxxxxxxxx>
---
arch/arm/lib/xor-neon.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..f9f3601cc2d1 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,20 +14,22 @@ MODULE_LICENSE("GPL");
#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
#endif

+/*
+ * TODO: Even though -ftree-vectorize is enabled by default in Clang, the
+ * compiler does not produce vectorized code due to its cost model.
+ * See: https://github.com/ClangBuiltLinux/linux/issues/503
+ */
+#ifdef CONFIG_CC_IS_CLANG
+#warning Clang does not vectorize code in this file.
+#endif

Arnd, remind me again why it's a bug that the compiler's cost model
says it's faster to not produce a vectorized version of these loops?
I stand by my previous comment: https://bugs.llvm.org/show_bug.cgi?id=40976#c8

+
/*
* Pull in the reference implementations while instructing GCC (through
* -ftree-vectorize) to attempt to exploit implicit parallelism and emit
* NEON instructions.
*/
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
#pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
#endif

#pragma GCC diagnostic ignored "-Wunused-variable"
--
2.30.0
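
To illustrate why the old version check misfires, here is a small userspace sketch (not kernel code; both helper names are made up for this example). By default Clang defines __GNUC__ and __GNUC_MINOR__ as if it were GCC 4.2, so the pre-patch condition evaluates false under Clang and the "requires at least version 4.6 of GCC" branch is taken.

```c
/*
 * Hypothetical helper: returns 1 if the pre-patch preprocessor condition
 * holds for the compiler building this file, 0 otherwise.
 */
static int old_gcc_check_passes(void)
{
#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
	return 1;
#else
	return 0;
#endif
}

/* Hypothetical helper: returns 1 when built with Clang, 0 otherwise. */
static int is_clang(void)
{
#ifdef __clang__
	return 1;
#else
	return 0;
#endif
}
```

With a default Clang build, is_clang() returns 1 while old_gcc_check_passes() returns 0, which is the situation the commit message describes. Inside the kernel, CONFIG_CC_IS_GCC and CONFIG_CC_IS_CLANG avoid relying on these version macros.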



--
Thanks,
~Nick Desaulniers