Re: [PATCH 1/5] bpf: Clear callee saved regs after updating REG0

From: David Vernet
Date: Mon Aug 08 2022 - 14:50:29 EST


On Mon, Aug 08, 2022 at 11:14:48AM -0700, Joanne Koong wrote:
> On Mon, Aug 8, 2022 at 8:53 AM David Vernet <void@xxxxxxxxxxxxx> wrote:
> >
> > In the verifier, we currently reset all of the registers containing caller
> > saved args before updating the callee's return register (REG0). In a
> > follow-on patch, we will need to be able to be able to inspect the caller
> > saved registers when updating REG0 to determine if a dynptr that's passed
> > to a helper function was allocated by a helper, or allocated by a program.
> >
> > This patch therefore updates check_helper_call() to clear the caller saved
> > regs after updating REG0.
> >
> Overall lgtm

Thanks for the quick review!

> There's a patch [0] that finds + stores the ref obj id before the
> caller saved regs get reset, which would make this patch not needed.

Interesting. Indeed, that would solve this issue, and I'm fine with that
approach as well, if not preferential to it.

> That hasn't been merged in yet though and I think there's pros for
> either approach.
>
> In the one where we find + store the ref obj id before any caller
> saved regs get reset, the pro is that getting the dynptr metadata (eg
> ref obj id and in the near future, the dynptr type as well) earlier
> will be useful (eg when we add skb/xdp dynptrs [1], we'll need to know
> the type of the dynptr in order to determine whether to set the return
> reg as PTR_TO_PACKET). In this patch, the pro is that the logic is a
> lot more obvious to readers that the ref obj id for the dynptr gets
> found and set in order to store it in the return reg's ref obj id.
>
> I personally lean more towards the approach in [0] because I think
> that ends up being cleaner for future extensibility, but I don't feel
> strongly about it and would be happy going with this approach as well

So, I think regardless of whether this gets merged, [0] is probably worth
merging as I agree that it simplifies the current logic for setting the ref
obj id and is a purely positive change on its own.

When I was originally typing my response to your email, I was wondering
whether it would be useful to keep the state in the caller saved registers
for the logic in 7360 - 7489 in general for the future even if [0] is
applied. It's probably simpler, however, to keep what we have now and just
reset all of the registers. So if [0] gets merged, I can just remove this
patch from the set.

> [0] https://lore.kernel.org/bpf/20220722175807.4038317-1-joannelkoong@xxxxxxxxx/#t
>
> [1] https://lore.kernel.org/bpf/20220726184706.954822-1-joannelkoong@xxxxxxxxx/T/#t
>
> > Signed-off-by: David Vernet <void@xxxxxxxxxxxxx>
> > ---
> > kernel/bpf/verifier.c | 15 ++++++++++-----
> > 1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 096fdac70165..938ba1536249 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -7348,11 +7348,9 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> > if (err)
> > return err;
> >
> > - /* reset caller saved regs */
> > - for (i = 0; i < CALLER_SAVED_REGS; i++) {
> > - mark_reg_not_init(env, regs, caller_saved[i]);
> > - check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
> > - }
> > + /* reset return reg */
> > + mark_reg_not_init(env, regs, BPF_REG_0);
> > + check_reg_arg(env, BPF_REG_0, DST_OP_NO_MARK);
> >
> > /* helper call returns 64-bit value. */
> > regs[BPF_REG_0].subreg_def = DEF_NOT_SUBREG;
> > @@ -7488,6 +7486,13 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> > regs[BPF_REG_0].ref_obj_id = dynptr_id;
> > }
> >
> > + /* reset remaining caller saved regs */
> > + BUILD_BUG_ON(caller_saved[0] != BPF_REG_0);
>
> nit: caller_saved is a read-only const, so I don't think this line is needed

It being a read-only const is was why I made this a BUILD_BUG_ON. My
intention here was to ensure that we're not accidentally skipping the
resetting of caller_saved[0]. The original code iterated from
caller_saved[0] -> caller_saved[CALLER_SAVED_REGS - 1]. Now that we're
starting from caller_saved[1], this compile-time assertion verifies that
we're not accidentally skipping caller_saved[0] by checking that it's the
same as BPF_REG_0, which is reset above. Does that make sense?

> > + for (i = 1; i < CALLER_SAVED_REGS; i++) {
>
> nit: maybe "for i = BPF_REG_1" ?

Good suggestion, will apply in the v2 (if there is one and we don't decide
to just go with [0] :-))

Thanks,
David