Re: [PATCH] bpf: turn off sanitizer in do_misc_fixups for old clang
From: Alexei Starovoitov
Date: Mon Jun 23 2025 - 17:32:57 EST
On Fri, Jun 20, 2025 at 4:38 AM Arnd Bergmann <arnd@xxxxxxxxxx> wrote:
>
> From: Arnd Bergmann <arnd@xxxxxxxx>
>
> clang versions before version 18 manage to badly optimize the bpf
> verifier, with lots of variable spills leading to excessive stack
> usage in addition to likely rather slow code:
>
> kernel/bpf/verifier.c:23936:5: error: stack frame size (2096) exceeds limit (1280) in 'bpf_check' [-Werror,-Wframe-larger-than]
> kernel/bpf/verifier.c:21563:12: error: stack frame size (1984) exceeds limit (1280) in 'do_misc_fixups' [-Werror,-Wframe-larger-than]
>
> Turn off the sanitizer in the two functions that suffer the most from
> this when using one of the affected clang versions.
>
> Signed-off-by: Arnd Bergmann <arnd@xxxxxxxx>
> ---
> kernel/bpf/verifier.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 2fa797a6d6a2..7724c7a56d79 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -19810,7 +19810,14 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
> return 0;
> }
>
> -static int do_check(struct bpf_verifier_env *env)
> +#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 180100
> +/* old clang versions cause excessive stack usage here */
> +#define __workaround_kasan __disable_sanitizer_instrumentation
> +#else
> +#define __workaround_kasan
> +#endif
> +
> +static __workaround_kasan int do_check(struct bpf_verifier_env *env)
This looks too hacky for a workaround.
Let's figure out what's causing such excessive stack usage and fix it.
We did some of this work in
commit 6f606ffd6dd7 ("bpf: Move insn_buf[16] to bpf_verifier_env")
and similar.
Looks like it wasn't enough or more stack usage crept in since then.
Also make sure you're using the latest bpf-next.
A bunch of code was moved out of do_check().
So I bet the current bpf-next/master doesn't have a problem
with this particular function.
In my kasan build do_check() is now fully inlined.
do_check_common() is not and it's using 512 bytes of stack.
> {
> bool pop_log = !(env->log.level & BPF_LOG_LEVEL2);
> struct bpf_verifier_state *state = env->cur_state;
> @@ -21817,7 +21824,7 @@ static int add_hidden_subprog(struct bpf_verifier_env *env, struct bpf_insn *pat
> /* Do various post-verification rewrites in a single program pass.
> * These rewrites simplify JIT and interpreter implementations.
> */
> -static int do_misc_fixups(struct bpf_verifier_env *env)
> +static __workaround_kasan int do_misc_fixups(struct bpf_verifier_env *env)
This one is using 832 bytes of stack with kasan.
Which is indeed high.
A big chunk seems to be coming from chk_and_sdiv[] and chk_and_smod[].
Yonghong,
looks like you contributed that piece of code.
Pls see how to reduce stack size here.
Daniel used this pattern in earlier commits. Looks like
we took it too far.
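
To illustrate the direction, here is a rough userspace sketch, not the
verifier code. struct insn, struct env, the opcode values and both
function names are made up for the example; the only thing borrowed
from the thread is the idea behind the insn_buf commit: build the patch
in a buffer owned by the long-lived env instead of in local arrays. In
the first variant the frame may need room for every template at once,
and sanitizer redzones can add to that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_insn; not the kernel definition. */
struct insn {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

#define SCRATCH_INSNS 16

/* Hypothetical stand-in for a long-lived, heap-allocated verifier env. */
struct env {
	struct insn insn_buf[SCRATCH_INSNS];
};

/*
 * Stack-heavy variant: both templates are locals of the same function,
 * so the frame may have to hold both, whichever one is actually used.
 */
static size_t patch_from_stack(struct insn *dst, int is_mod)
{
	struct insn tmpl_div[] = {
		{ .code = 0x01, .imm = -1 },
		{ .code = 0x02, .off = 1 },
		{ .code = 0x03 },
	};
	struct insn tmpl_mod[] = {
		{ .code = 0x01, .imm = -1 },
		{ .code = 0x02, .off = 2 },
		{ .code = 0x04 },
		{ .code = 0x05 },
	};

	if (is_mod) {
		memcpy(dst, tmpl_mod, sizeof(tmpl_mod));
		return sizeof(tmpl_mod) / sizeof(tmpl_mod[0]);
	}
	memcpy(dst, tmpl_div, sizeof(tmpl_div));
	return sizeof(tmpl_div) / sizeof(tmpl_div[0]);
}

/* Leaner variant: emit straight into the scratch buffer owned by env. */
static size_t patch_into_env(struct env *env, int is_mod)
{
	struct insn *p = env->insn_buf;
	size_t n = 0;

	p[n++] = (struct insn){ .code = 0x01, .imm = -1 };
	p[n++] = (struct insn){ .code = 0x02, .off = is_mod ? 2 : 1 };
	if (is_mod) {
		p[n++] = (struct insn){ .code = 0x04 };
		p[n++] = (struct insn){ .code = 0x05 };
	} else {
		p[n++] = (struct insn){ .code = 0x03 };
	}
	/* The scratch buffer is fixed-size; make overflow loud. */
	assert(n <= SCRATCH_INSNS);
	return n;
}
```

Both variants produce the same instruction sequence. The second one
keeps only a pointer and a counter in the frame, at the cost of the
patch no longer reading as a single initializer.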