Re: [PATCH v2] Add BPF_SYNCHRONIZE_MAPS bpf(2) command

From: Joel Fernandes
Date: Mon Jul 30 2018 - 22:06:34 EST


On Mon, Jul 30, 2018 at 07:01:22PM -0700, Joel Fernandes wrote:
> On Sun, Jul 29, 2018 at 06:51:18PM +0300, Alexei Starovoitov wrote:
> > On Thu, Jul 26, 2018 at 7:51 PM, Daniel Colascione <dancol@xxxxxxxxxx> wrote:
> > > BPF_SYNCHRONIZE_MAPS waits for the release of any references to a BPF
> > > map made by a BPF program that is running at the time the
> > > BPF_SYNCHRONIZE_MAPS command is issued. The purpose of this command is
> > > to provide a means for userspace to replace a BPF map with another,
> > > newer version, then ensure that no component is still using the "old"
> > > map before manipulating the "old" map in some way.
> > >
> > > Signed-off-by: Daniel Colascione <dancol@xxxxxxxxxx>
> > > ---
> > > include/uapi/linux/bpf.h | 9 +++++++++
> > > kernel/bpf/syscall.c | 13 +++++++++++++
> > > 2 files changed, 22 insertions(+)
> > >
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index b7db3261c62d..5b27e9117d3e 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -75,6 +75,14 @@ struct bpf_lpm_trie_key {
> > > __u8 data[0]; /* Arbitrary size */
> > > };
> > >
> > > +/* BPF_SYNCHRONIZE_MAPS waits for the release of any references to a
> > > + * BPF map made by a BPF program that is running at the time the
> > > + * BPF_SYNCHRONIZE_MAPS command is issued. The purpose of this command
> >
> > that doesn't sound right to me.
> > such command won't wait for the release of the references.
> > in case of map-in-map the program does not hold
> > the references to inner map (only to outer map).
>
> I didn't follow this completely.
>
> The userspace program is using the inner map per your description of the

Sorry, just to correct myself: here I meant "The kernel eBPF program is using
the inner map on multiple CPUs", not "The userspace program".

thanks,

- Joel

> algorithm for using map-in-map to solve the race conditions that this patch
> is trying to address:
>
> If you don't mind, I copy-pasted it below from your netdev post:
>
> if you use map-in-map you don't need extra boolean map.
> 0. bpf prog can do
> inner_map = lookup(map_in_map, key=0);
> lookup(inner_map, your_real_key);
> 1. user space writes into map_in_map[0] <- FD of new map
> 2. some cpus are using old inner map and some a new
> 3. user space does sys_membarrier(CMD_GLOBAL) which will do synchronize_sched()
> which in CONFIG_PREEMPT_NONE=y servers is the same as synchronize_rcu()
> which will guarantee that progs finished.
> 4. scan old inner map
>
> In step 2, as you mentioned there are CPUs using different inner maps. So
> could you clarify how the synchronize_rcu mechanism will even work if you're
> now saying "program does not hold references to the inner maps"?
>
> Thanks!
>
> - Joel
>
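
For reference, the quoted steps 0-4 can be modelled in plain userspace C. This
is only a sketch under stated assumptions, not the kernel implementation and
not the bpf(2) API: struct inner_map, outer_slot, progs_running and every
function name below are hypothetical stand-ins. The model tracks which
"programs" are still running rather than per-map references, which is one way
to read step 3.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define INNER_SLOTS 4

/* Hypothetical stand-in for an inner BPF map: a tiny fixed-size array. */
struct inner_map {
    int values[INNER_SLOTS];
};

/* Hypothetical stand-in for map_in_map[0]: one slot naming the current inner map. */
static _Atomic(struct inner_map *) outer_slot;

/*
 * Number of model "programs" between prog_begin() and prog_end().
 * This models a read-side critical section, not a reference on any map.
 */
static atomic_int progs_running;

/* Step 0: a program starts and looks up the inner map through the outer slot. */
static struct inner_map *prog_begin(void)
{
    atomic_fetch_add(&progs_running, 1);
    return atomic_load(&outer_slot);
}

/* The program has finished and no longer touches the map it looked up. */
static void prog_end(void)
{
    atomic_fetch_sub(&progs_running, 1);
}

/* Step 1: user space installs a new inner map; the previous one is returned. */
static struct inner_map *publish_inner(struct inner_map *new_map)
{
    return atomic_exchange(&outer_slot, new_map);
}

/*
 * Step 3, as a non-blocking probe: true once no program is running.
 * A real grace period only waits for programs that started before the
 * swap; waiting for zero running programs is a stricter simplification.
 */
static int progs_quiesced(void)
{
    return atomic_load(&progs_running) == 0;
}

/* Step 4: scan the old inner map, allowed only after step 3 has completed. */
static int scan_inner(const struct inner_map *map)
{
    int sum = 0;

    assert(progs_quiesced());
    for (size_t i = 0; i < INNER_SLOTS; i++)
        sum += map->values[i];
    return sum;
}
```

In this model, a program that called prog_begin() before publish_inner() keeps
using the old map (step 2), and scan_inner() on the pointer returned by
publish_inner() is safe only once progs_quiesced() reports true.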