Re: seqcount usage in xt_replace_table()

From: Florian Westphal
Date: Thu Jan 10 2019 - 09:48:21 EST


Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> /*
> * Ensure contents of newinfo are visible before assigning to
> * private.
> */
> smp_wmb();
> table->private = newinfo;
>
> we have:
>
> smp_store_release(&table->private, newinfo);
>
> But what store does that second smp_wmb() order against? The comment:
>
> /* make sure all cpus see new ->private value */
> smp_wmb();
>
> makes no sense what so ever, no smp_*() barrier can provide such
> guarantees.

IIRC I added this at the request of a reviewer of an earlier iteration
of that patch.

IIRC the concern was that compiler/hw could re-order

smb_wmb();
table->private = newinfo;
/* wait until all cpus are done with old table */

into:

smb_wmb();
/* wait until all cpus are done with old table */
...
table->private = newinfo;

and that (obviously) makes the wait-loop useless.

> > Only alternative I see that might work is synchronize_rcu (the
> > _do_table functions are called with rcu read lock held).
> >
> > I guess current scheme is cheaper though.
>
> Is performance a concern in this path? There is no comment justifying
> this 'creative' stuff.

We have to wait until all cpus are done with current iptables ruleset.

Before this 'creative' change, this relied on get_counters
synchronization. And that caused wait times of 30 seconds or more on
busy systems.

I have no objections swapping this with a synchronize_rcu() if that
makes it easier.

(synchronize_rcu might be confusing though, as we don't use rcu
for table->private), and one 'has to know' that all the netfilter
hooks, including the iptables eval loop, run with rcu_read_lock
held).