Re: [PATCH bpf-next] bpf: Fix RCU usage in bpf_get_cgroup_classid_curr helper

From: Charalampos Mitrodimas
Date: Tue Jun 10 2025 - 11:33:05 EST


Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:

> On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>>
>> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
>> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
>> > <charmitro@xxxxxxxxxx> wrote:
>> >>
>> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>> >> program types. This helper used __task_get_classid() which calls
>> >> task_cls_state() that requires rcu_read_lock_bh_held().
>> >>
>> >> This triggers an RCU warning when called from BPF syscall programs
>> >> which run under rcu_read_lock_trace():
>> >>
>> >> WARNING: suspicious RCU usage
>> >> 6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>> >> -----------------------------
>> >> net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>> >>
>> >> Fix this by replacing __task_get_classid() with task_cls_classid()
>> >> which handles RCU locking internally using regular rcu_read_lock() and
>> >> is safe to call from any context.
>> >>
>> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@xxxxxxxxxxxxxxxxxxxxxxxxx
>> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>> >> Signed-off-by: Charalampos Mitrodimas <charmitro@xxxxxxxxxx>
>> >> ---
>> >> net/core/filter.c | 2 +-
>> >> 1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/net/core/filter.c b/net/core/filter.c
>> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>> >> --- a/net/core/filter.c
>> >> +++ b/net/core/filter.c
>> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>> >> #ifdef CONFIG_CGROUP_NET_CLASSID
>> >> BPF_CALL_0(bpf_get_cgroup_classid_curr)
>> >> {
>> >> - return __task_get_classid(current);
>> >> + return task_cls_classid(current);
>> >> }
>> >
>> > Daniel added this helper in
>> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
>> > with intention to use it from networking hooks.
>> >
>> > But task_cls_classid() has
>> > if (in_interrupt())
>> > return 0;
>> >
>> > which will trigger in softirq and tc hooks.
>> > So this might break Daniel's use case.
>>
>> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
>> a new helper implementation for the more generic, non-networking case which
>> then internally uses task_cls_classid().
>
> Instead of forking the helper I think we can :
> rcu_read_lock_bh_held() || rcu_read_lock_held()
> in task_cls_state().

I tested your suggestion with,

rcu_read_lock_bh_held() || rcu_read_lock_held()

but it still triggers the RCU warning because BPF syscall programs use
rcu_read_lock_trace().

Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
checkpatch warning:

WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code

I think the best solution here would be to add
local_bh_disable()/enable() protection directly in the BPF helper. This
keeps the fix localized to where the problem exists, and avoids
modifying core cgroup RCU.

> And that will do it.