Re: [PATCH bpf-next v3 4/5] bpf: Add support for writing to nf_conn:mark

From: Daniel Xu
Date: Fri Aug 19 2022 - 20:21:35 EST


Hi Kumar,

On Sat, Aug 20, 2022 at 01:46:04AM +0200, Kumar Kartikeya Dwivedi wrote:
> On Sat, 20 Aug 2022 at 01:23, Daniel Xu <dxu@xxxxxxxxx> wrote:
[...]
> > +static int tc_cls_act_btf_struct_access(struct bpf_verifier_log *log,
> > +					const struct btf *btf,
> > +					const struct btf_type *t, int off,
> > +					int size, enum bpf_access_type atype,
> > +					u32 *next_btf_id,
> > +					enum bpf_type_flag *flag)
> > +{
> > +	btf_struct_access_t sa;
> > +
> > +	if (atype == BPF_READ)
> > +		return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
> > +					 flag);
> > +
> > +	sa = READ_ONCE(nf_conntrack_btf_struct_access);
>
> This looks unsafe. How do you prevent this race?
>
> CPU 0                                 CPU 1
> sa = READ_ONCE(nf_ct_bsa);
>
>                                       delete_module("nf_conntrack", ..);
>
>                                       WRITE_ONCE(nf_ct_bsa, NULL);
>                                       // finishes successfully
> if (sa)
>         return sa(...); // oops
>
> i.e. what keeps the module alive while we execute its callback?
>
> Using a mutex is one way (as I suggested previously), either you
> acquire it before unload, or after. If after, you see cb as NULL,
> otherwise if unload is triggered concurrently it waits to acquire the
> mutex held by us. Unsetting the cb would be the first thing the module
> would do.
>
> You can also hold a module reference, but then you must verify it is
> nf_conntrack's BTF before using btf_try_get_module.
> But _something_ needs to be done to prevent the module from going away
> while we execute its code.

I think I somehow convinced myself that nf_conntrack_core.o is always
compiled in, due to some of the garbage collection semantics I saw in
the code.

Lemme take a closer look (for learning, I guess). A mutex is probably
the safest bet.
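
For reference, here is a rough userspace sketch of the mutex scheme
described above, not the actual patch. Everything in it is a stand-in:
struct_access_fn is a simplified, hypothetical version of
btf_struct_access_t, the sa_* helpers and demo_access() are made-up
names, and a pthread mutex plays the role of a kernel mutex. The point
is only that the callback is read and invoked under the same lock the
unregister path takes, so unload cannot finish while the callback runs.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for btf_struct_access_t. */
typedef int (*struct_access_fn)(int off, int size);

static pthread_mutex_t sa_lock = PTHREAD_MUTEX_INITIALIZER;
static struct_access_fn sa_cb;	/* NULL while the "module" is not loaded */

/* "Module init" side: publish the callback under the lock. */
static void sa_register(struct_access_fn cb)
{
	assert(cb != NULL);
	pthread_mutex_lock(&sa_lock);
	sa_cb = cb;
	pthread_mutex_unlock(&sa_lock);
}

/*
 * "Module exit" side: unsetting the callback is the first thing done on
 * unload. This blocks for as long as a caller is inside sa_call().
 */
static void sa_unregister(void)
{
	pthread_mutex_lock(&sa_lock);
	sa_cb = NULL;
	pthread_mutex_unlock(&sa_lock);
}

/*
 * Caller side: the NULL check and the call both happen with the lock
 * held, so the callback cannot be torn down between the two.
 */
static int sa_call(int off, int size)
{
	int ret = -EACCES;	/* no callback registered: reject the write */

	pthread_mutex_lock(&sa_lock);
	if (sa_cb)
		ret = sa_cb(off, size);
	pthread_mutex_unlock(&sa_lock);
	return ret;
}

/* Example callback: allow only a 4-byte write at a made-up offset. */
static int demo_access(int off, int size)
{
	return (off == 8 && size == 4) ? 0 : -EACCES;
}
```

With this shape, a caller either takes the lock before unload starts
(and unload waits for it), or takes it after and sees a NULL callback.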

[...]

Thanks,
Daniel