Re: [PATCH bpf-next v2] net: Fix RCU usage in task_cls_state() for BPF programs

From: Charalampos Mitrodimas
Date: Wed Jun 11 2025 - 13:08:41 EST


Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:

> On Wed, Jun 11, 2025 at 2:04 AM Charalampos Mitrodimas
> <charmitro@xxxxxxxxxx> wrote:
>>
>> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>> program types, not just networking programs.
>>
>> This helper calls __task_get_classid() which internally calls
>> task_cls_state() requiring rcu_read_lock_bh_held(). This works in
>> networking/tc context where RCU BH is held, but triggers an RCU
>> warning when called from other contexts like BPF syscall programs that
>> run under rcu_read_lock_trace():
>>
>> WARNING: suspicious RCU usage
>> 6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>> -----------------------------
>> net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>>
>> Fix this by also accepting rcu_read_lock_trace_held() as a valid RCU
>> context in the task_cls_state() function. This is safe because BPF
>> programs are non-sleepable and task_cls_state() is only doing an RCU
>> dereference to get the classid.
>>
>> Reported-by: syzbot+b4169a1cfb945d2ed0ec@xxxxxxxxxxxxxxxxxxxxxxxxx
>> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>> Signed-off-by: Charalampos Mitrodimas <charmitro@xxxxxxxxxx>
>> ---
>> Changes in v2:
>> - Fix RCU usage in task_cls_state() instead of BPF helper
>> - Add rcu_read_lock_trace_held() check to accept trace RCU as valid
>> context
>> - Drop the approach of using task_cls_classid() which has in_interrupt()
>> check
>> - Link to v1: https://lore.kernel.org/r/20250608-rcu-fix-task_cls_state-v1-1-2a2025b4603b@xxxxxxxxxx
>> ---
>> net/core/netclassid_cgroup.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c
>> index d22f0919821e931fbdedf5a8a7a2998d59d73978..df86f82d747ac40e99597d6f2d921e8cc2834e64 100644
>> --- a/net/core/netclassid_cgroup.c
>> +++ b/net/core/netclassid_cgroup.c
>> @@ -21,7 +21,8 @@ static inline struct cgroup_cls_state *css_cls_state(struct cgroup_subsys_state
>>  struct cgroup_cls_state *task_cls_state(struct task_struct *p)
>>  {
>>         return css_cls_state(task_css_check(p, net_cls_cgrp_id,
>> -                                           rcu_read_lock_bh_held()));
>> +                                           rcu_read_lock_bh_held() ||
>> +                                           rcu_read_lock_trace_held()));
>
> This is incomplete. It only addresses one particular syzbot report.
> It needs to include rcu_read_lock_held() as well.

Which other report are you referring to?

>
> pw-bot: cr
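
For illustration only, the difference between the v2 check and the check
with rcu_read_lock_held() added can be modelled in userspace. This is a
rough sketch, not kernel code: struct rcu_ctx and both cls_check_*()
functions are hypothetical names invented here, and the three booleans
merely stand in for the lockdep predicates named in the thread. It assumes
a caller that holds only plain rcu_read_lock(), which is the case the
review comment appears to be concerned with.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the calling context; not a kernel type. */
struct rcu_ctx {
	bool bh_held;    /* models rcu_read_lock_bh_held() */
	bool trace_held; /* models rcu_read_lock_trace_held() */
	bool plain_held; /* models rcu_read_lock_held() */
};

/* Condition as written in v2 of the patch: BH or tasks-trace RCU. */
bool cls_check_v2(const struct rcu_ctx *c)
{
	return c->bh_held || c->trace_held;
}

/* Condition with the reviewer's suggestion folded in (assumed form). */
bool cls_check_suggested(const struct rcu_ctx *c)
{
	return c->bh_held || c->trace_held || c->plain_held;
}

/* Walk the contexts discussed in the thread through both conditions. */
void cls_check_selftest(void)
{
	struct rcu_ctx tc = { .bh_held = true };
	struct rcu_ctx bpf_syscall = { .trace_held = true };
	struct rcu_ctx plain_rcu = { .plain_held = true };

	assert(cls_check_v2(&tc));
	assert(cls_check_v2(&bpf_syscall));
	/* v2 would still warn for a caller under plain rcu_read_lock(). */
	assert(!cls_check_v2(&plain_rcu));
	assert(cls_check_suggested(&plain_rcu));
}
```

In this model the v2 condition passes for the tc and BPF syscall contexts
but rejects a context that holds only plain RCU, while the extended
condition accepts all three.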