Re: CVE-2023-53076: bpf: Adjust insufficient default bpf_jit_limit

From: Shung-Hsi Yu
Date: Mon May 05 2025 - 03:57:53 EST

On Fri, May 02, 2025 at 05:55:41PM +0200, Greg Kroah-Hartman wrote:
> From: Greg Kroah-Hartman <gregkh@xxxxxxxxxx>
>
> Description
> ===========
>
> In the Linux kernel, the following vulnerability has been resolved:
>
> bpf: Adjust insufficient default bpf_jit_limit
>
> We've seen recent AWS EKS (Kubernetes) user reports like the following:
>
> After upgrading EKS nodes from v20230203 to v20230217 on our 1.24 EKS
> clusters after a few days a number of the nodes have containers stuck
> in ContainerCreating state or liveness/readiness probes reporting the
> following error:
>
> Readiness probe errored: rpc error: code = Unknown desc = failed to
> exec in container: failed to start exec "4a11039f730203ffc003b7[...]":
> OCI runtime exec failed: exec failed: unable to start container process:
> unable to init seccomp: error loading seccomp filter into kernel:
> error loading seccomp filter: errno 524: unknown
>
> However, we had not been seeing this issue on previous AMIs and it only
> started to occur on v20230217 (following the upgrade from kernel 5.4 to
> 5.10) with no other changes to the underlying cluster or workloads.
>
> We tried the suggestions from that issue (sysctl net.core.bpf_jit_limit=452534528)
> which helped to immediately allow containers to be created and probes to
> execute but after approximately a day the issue returned and the value
> returned by cat /proc/vmallocinfo | grep bpf_jit | awk '{s+=$2} END {print s}'
> was steadily increasing.
>
> I tested bpf tree to observe bpf_jit_charge_modmem, bpf_jit_uncharge_modmem
> their sizes passed in as well as bpf_jit_current under tcpdump BPF filter,
> seccomp BPF and native (e)BPF programs, and the behavior all looks sane
> and expected, that is nothing "leaking" from an upstream perspective.
>
> The bpf_jit_limit knob was originally added in order to avoid a situation
> where unprivileged applications loading BPF programs (e.g. seccomp BPF
> policies) consuming all the module memory space via BPF JIT such that loading
> of kernel modules would be prevented. The default limit was defined back in
> 2018 and while good enough back then, we are generally seeing far more BPF
> consumers today.
>
> Adjust the limit for the BPF JIT pool from originally 1/4 to now 1/2 of the
> module memory space to better reflect today's needs and avoid more users
> running into potentially hard to debug issues.
>
> The Linux kernel CVE team has assigned CVE-2023-53076 to this issue.

I'd like to dispute this CVE.

The fix only raises the default value of net.core.bpf_jit_limit, which
the sysadmin can already adjust, from 25% to 50% of the module memory
space. It does not seem to have security implications on the Linux
kernel side.
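
To illustrate, the change in the default amounts to roughly the
arithmetic below. This is a minimal userspace sketch, not the kernel
code: the function names, the 4 KiB page size, and the rounding to a
page boundary are assumptions made for the example.

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Round x up to the next multiple of align (align must be a power of two). */
static uint64_t sketch_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Old default: 1/4 of the module memory space, page aligned. */
static uint64_t sketch_jit_limit_quarter(uint64_t module_space_bytes)
{
	return sketch_round_up(module_space_bytes >> 2, SKETCH_PAGE_SIZE);
}

/* New default: 1/2 of the module memory space, page aligned. */
static uint64_t sketch_jit_limit_half(uint64_t module_space_bytes)
{
	return sketch_round_up(module_space_bytes >> 1, SKETCH_PAGE_SIZE);
}
```

For a hypothetical 1 GiB module memory space, the first helper yields
256 MiB and the second 512 MiB. Either way the result is only a
starting value that can be overridden through the sysctl.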

Shung-Hsi
