Thanks for your explanation, I missed the latter.
This iter can be used in all ctx-s, which is nice, but let's
make the verifier enforce rcu_read_lock/unlock done by the bpf prog
instead of doing it in the ctor/dtor of the iter, since
in sleepable progs the verifier won't recognize that the body is an RCU CS.
We'd need to teach the verifier to allow bpf_iter_process_new()
inside in_rcu_cs() and make sure there is no rcu_read_unlock
while BPF_ITER_STATE_ACTIVE.
bpf_iter_process_destroy() would become a nop.
Thanks for your review!
I think bpf_iter_process_{new, next, destroy} should be protected by
bpf_rcu_read_lock/unlock explicitly whether the prog is sleepable or
not, right?
Correct. By explicit bpf_rcu_read_lock() in case of sleepable progs
or just by using them in normal bpf progs that have implicit rcu_read_lock()
done before calling into them.
Maybe something like the following?
I'm not very familiar with the BPF verifier, but I believe
there is still a risk in directly calling these kfuncs even if
in_rcu_cs() is true.
Maybe what we actually need here is to make the BPF verifier check that
env->cur_state->active_rcu_lock is true when we want to call these kfuncs.
active_rcu_lock means explicit bpf_rcu_read_lock.
Currently we do allow bpf_rcu_read_lock in non-sleepable, but it's pointless.
Technically we can extend the check:
	if (in_rbtree_lock_required_cb(env) && (rcu_lock || rcu_unlock)) {
		verbose(env, "Calling bpf_rcu_read_{lock,unlock} in unnecessary rbtree callback\n");
		return -EACCES;
	}
to discourage their use in all non-sleepable, but it will break some progs.
I think it's ok to check in_rcu_cs() to allow bpf_iter_process_*().
If bpf prog adds explicit and unnecessary bpf_rcu_read_lock() around
the iter ops it won't do any harm.
Just need to make sure that rcu unlock logic:
	} else if (rcu_unlock) {
		bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
			if (reg->type & MEM_RCU) {
				reg->type &= ~(MEM_RCU | PTR_MAYBE_NULL);
				reg->type |= PTR_UNTRUSTED;
			}
		}));
clears iter state that depends on rcu.
I thought about changing mark_stack_slots_iter() to do
	st->type = PTR_TO_STACK | MEM_RCU;
so that the above clearing logic kicks in,
but it might be better to have something iter specific.
is_iter_reg_valid_init() should probably be changed to
make sure reg->type is not UNTRUSTED.
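
To make the proposal above concrete, here is a small userspace toy model of it.
It is only a sketch and not verifier code: every toy_* name, the flag values and
the slot array are invented for illustration, and toy_in_rcu_cs() is a
simplified stand-in for in_rcu_cs(). It shows the three pieces discussed:
iter creation is allowed only inside an RCU CS, rcu unlock marks MEM_RCU slots
as untrusted, and the iter validity check rejects an untrusted slot.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flags: names echo the verifier's, values are made up for this sketch. */
enum {
	TOY_MEM_RCU        = 1 << 0,
	TOY_PTR_MAYBE_NULL = 1 << 1,
	TOY_PTR_UNTRUSTED  = 1 << 2,
};

#define TOY_EACCES   13	/* stand-in error code, hypothetical */
#define TOY_NR_SLOTS 4

struct toy_slot {
	unsigned int type;	/* toy type flags */
	bool iter_active;	/* models BPF_ITER_STATE_ACTIVE */
};

struct toy_state {
	bool sleepable;		/* prog is sleepable */
	bool active_rcu_lock;	/* explicit bpf_rcu_read_lock() is held */
	struct toy_slot slots[TOY_NR_SLOTS];
};

/* Simplified in_rcu_cs(): explicit lock held, or prog is non-sleepable. */
static bool toy_in_rcu_cs(const struct toy_state *st)
{
	return st->active_rcu_lock || !st->sleepable;
}

static void toy_rcu_lock(struct toy_state *st)
{
	st->active_rcu_lock = true;
}

/* Models allowing bpf_iter_process_new() only inside an RCU CS. */
static int toy_iter_new(struct toy_state *st, int slot)
{
	if (!toy_in_rcu_cs(st))
		return -TOY_EACCES;
	st->slots[slot].type = TOY_MEM_RCU;	/* iter state depends on RCU */
	st->slots[slot].iter_active = true;
	return 0;
}

/* Models the rcu_unlock branch: MEM_RCU state becomes untrusted. */
static void toy_rcu_unlock(struct toy_state *st)
{
	st->active_rcu_lock = false;
	for (int i = 0; i < TOY_NR_SLOTS; i++) {
		struct toy_slot *s = &st->slots[i];

		if (s->type & TOY_MEM_RCU) {
			s->type &= ~(unsigned int)(TOY_MEM_RCU | TOY_PTR_MAYBE_NULL);
			s->type |= TOY_PTR_UNTRUSTED;
		}
	}
}

/* Models the suggested is_iter_reg_valid_init() change. */
static bool toy_iter_valid_init(const struct toy_slot *s)
{
	return s->iter_active && !(s->type & TOY_PTR_UNTRUSTED);
}

/* Walk through the sleepable-prog case; returns 0 if the model behaves. */
int toy_demo(void)
{
	struct toy_state st = { .sleepable = true };

	assert(toy_iter_new(&st, 0) == -TOY_EACCES);	/* no RCU CS yet */
	toy_rcu_lock(&st);
	assert(toy_iter_new(&st, 0) == 0);
	toy_rcu_unlock(&st);				/* unlock while iter is active */
	assert(!toy_iter_valid_init(&st.slots[0]));	/* next/destroy rejected */
	return 0;
}
```

In this model the unlock itself is not rejected; instead any later use of the
iter slot fails the validity check, which is one possible reading of "clears
iter state that depends on rcu". An iter-specific flag instead of reusing
MEM_RCU would only change what toy_rcu_unlock() tests for.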
Andrii,
do you have better suggestions?