Re: [RFC PATCH bpf-next 2/4] bpf: Introduce process open coded iterator kfuncs

From: Chuyi Zhou
Date: Thu Sep 07 2023 - 12:39:06 EST


Hello,
On 2023/9/7 01:17, Alexei Starovoitov wrote:
[...cut...]
This iter can be used in all ctx-s which is nice, but let's
make the verifier enforce rcu_read_lock/unlock done by bpf prog
instead of doing in the ctor/dtor of iter, since
in sleepable progs the verifier won't recognize that body is RCU CS.
We'd need to teach the verifier to allow bpf_iter_process_new()
inside in_rcu_cs() and make sure there is no rcu_read_unlock
while BPF_ITER_STATE_ACTIVE.
bpf_iter_process_destroy() would become a nop.

Thanks for your review!

I think bpf_iter_process_{new, next, destroy} should be protected by
bpf_rcu_read_lock/unlock explicitly whether the prog is sleepable or
not, right?

Correct. By explicit bpf_rcu_read_lock() in case of sleepable progs
or just by using them in normal bpf progs that have implicit rcu_read_lock()
done before calling into them.
Thanks for your explanation, I missed the latter.

I'm not very familiar with the BPF verifier, but I believe
there is still a risk in directly calling these kfuncs even if
in_rcu_cs() is true.

Maybe what we actually need here is to make the BPF verifier check that
env->cur_state->active_rcu_lock is true when we want to call these kfuncs.

active_rcu_lock means explicit bpf_rcu_read_lock.
Currently we do allow bpf_rcu_read_lock in non-sleepable, but it's pointless.
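
To make the distinction above concrete, here is a minimal userspace sketch of the two conditions being compared: "the prog took bpf_rcu_read_lock() explicitly" versus "the prog body runs inside an RCU CS at all". The model_* names and struct layouts are hypothetical stand-ins, not kernel types, and the real in_rcu_cs() may consider further conditions that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for verifier state; not kernel types. */
struct model_state {
	bool active_rcu_lock;	/* set by an explicit bpf_rcu_read_lock() */
};

struct model_env {
	struct model_state *cur_state;
	bool prog_sleepable;	/* sleepable progs get no implicit rcu_read_lock() */
};

/* True only when the program took the lock explicitly. */
static bool model_explicit_rcu_lock(const struct model_env *env)
{
	assert(env != NULL && env->cur_state != NULL);
	return env->cur_state->active_rcu_lock;
}

/* True when the body runs under RCU, either explicitly or implicitly. */
static bool model_in_rcu_cs(const struct model_env *env)
{
	return model_explicit_rcu_lock(env) || !env->prog_sleepable;
}
```

Under this model a non-sleepable prog passes the in_rcu_cs-style check without any explicit lock, while a check on active_rcu_lock alone would reject it.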

Technically we can extend the check:
	if (in_rbtree_lock_required_cb(env) && (rcu_lock || rcu_unlock)) {
		verbose(env, "Calling bpf_rcu_read_{lock,unlock} in unnecessary rbtree callback\n");
		return -EACCES;
	}
to discourage their use in all non-sleepable, but it will break some progs.

I think it's ok to check in_rcu_cs() to allow bpf_iter_process_*().
If bpf prog adds explicit and unnecessary bpf_rcu_read_lock() around
the iter ops it won't do any harm.
Just need to make sure that rcu unlock logic:
	} else if (rcu_unlock) {
		bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
			if (reg->type & MEM_RCU) {
				reg->type &= ~(MEM_RCU | PTR_MAYBE_NULL);
				reg->type |= PTR_UNTRUSTED;
			}
		}));
clears iter state that depends on rcu.
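
The effect of that unlock logic on register types can be sketched in plain userspace C. This is only a model: the MODEL_* bit values and struct model_reg are hypothetical, and the flat array stands in for whatever bpf_for_each_reg_in_vstate() actually walks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values; the kernel defines these flags differently. */
#define MODEL_PTR_MAYBE_NULL	(1u << 0)
#define MODEL_MEM_RCU		(1u << 1)
#define MODEL_PTR_UNTRUSTED	(1u << 2)

struct model_reg {
	uint32_t type;
};

/* Apply the unlock rule to every register in a flat array. */
static void model_rcu_unlock(struct model_reg *regs, size_t n)
{
	assert(regs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (regs[i].type & MODEL_MEM_RCU) {
			/* RCU protection is gone: drop the RCU flags, distrust the pointer. */
			regs[i].type &= ~(MODEL_MEM_RCU | MODEL_PTR_MAYBE_NULL);
			regs[i].type |= MODEL_PTR_UNTRUSTED;
		}
	}
}
```

Registers without MEM_RCU are left untouched, which is why iter state would need either the MEM_RCU flag or an iter-specific rule to be covered.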

I thought about changing mark_stack_slots_iter() to do
st->type = PTR_TO_STACK | MEM_RCU;
so that the above clearing logic kicks in,
but it might be better to have something iter specific.
is_iter_reg_valid_init() should probably be changed to
make sure reg->type is not UNTRUSTED.

Maybe something like the following?

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index bb78212fa5b2..9185c4a40a21 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1172,7 +1172,15 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg

 static void __mark_reg_known_zero(struct bpf_reg_state *reg);

+static bool in_rcu_cs(struct bpf_verifier_env *env);
+
+/* check whether we are using bpf_iter_process_*() or bpf_iter_css_*() */
+static bool is_iter_need_rcu(struct bpf_kfunc_call_arg_meta *meta)
+{
+
+}
 static int mark_stack_slots_iter(struct bpf_verifier_env *env,
+				 struct bpf_kfunc_call_arg_meta *meta,
 				 struct bpf_reg_state *reg, int insn_idx,
 				 struct btf *btf, u32 btf_id, int nr_slots)
 {
@@ -1193,6 +1201,12 @@ static int mark_stack_slots_iter(struct bpf_verifier_env *env,

 		__mark_reg_known_zero(st);
 		st->type = PTR_TO_STACK; /* we don't have dedicated reg type */
+		if (is_iter_need_rcu(meta)) {
+			if (in_rcu_cs(env))
+				st->type |= MEM_RCU;
+			else
+				st->type |= PTR_UNTRUSTED;
+		}
 		st->live |= REG_LIVE_WRITTEN;
 		st->ref_obj_id = i == 0 ? id : 0;
 		st->iter.btf = btf;
@@ -1281,6 +1295,8 @@ static bool is_iter_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_
 		struct bpf_stack_state *slot = &state->stack[spi - i];
 		struct bpf_reg_state *st = &slot->spilled_ptr;

+		if (st->type & PTR_UNTRUSTED)
+			return false;
 		/* only main (first) slot has ref_obj_id set */
 		if (i == 0 && !st->ref_obj_id)
 			return false;
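
The intended behaviour of the diff above can be checked with a small userspace model. This is a sketch under stated assumptions, not verifier code: the MODEL_* bits, struct model_slot, and both function names are hypothetical, and only the first (main) slot of an iterator is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values; in the kernel PTR_TO_STACK is a base type, not a flag. */
#define MODEL_PTR_TO_STACK	(1u << 0)
#define MODEL_MEM_RCU		(1u << 8)
#define MODEL_PTR_UNTRUSTED	(1u << 9)

struct model_slot {
	uint32_t type;
	uint32_t ref_obj_id;
};

/* Mirrors the marking step: tag the slot depending on RCU CS membership. */
static void model_mark_iter_slot(struct model_slot *st, bool need_rcu,
				 bool in_rcu_cs, uint32_t id)
{
	assert(st != NULL);
	st->type = MODEL_PTR_TO_STACK;
	if (need_rcu)
		st->type |= in_rcu_cs ? MODEL_MEM_RCU : MODEL_PTR_UNTRUSTED;
	st->ref_obj_id = id;
}

/* Mirrors the validity step: an untrusted iter slot is never valid. */
static bool model_iter_slot_valid(const struct model_slot *st)
{
	assert(st != NULL);
	if (st->type & MODEL_PTR_UNTRUSTED)
		return false;
	return st->ref_obj_id != 0;
}
```

In this model an RCU-dependent iterator created outside an RCU CS is marked untrusted at creation, so later next/destroy calls would fail the validity check; iterators that do not need RCU are unaffected.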

Andrii,
do you have better suggestions?