RE: [EXT] Re: [v3,net-next 1/4] net: qos: introduce a gate control flow action

From: Po Liu
Date: Thu Apr 23 2020 - 05:15:58 EST


Hi Vlad Buslov,

> > >> > +static enum hrtimer_restart gate_timer_func(struct hrtimer *timer)
> {
> > >> > + struct gate_action *gact = container_of(timer, struct
> gate_action,
> > >> > + hitimer);
> > >> > + struct tcf_gate_params *p = get_gate_param(gact);
> > >> > + struct tcfg_gate_entry *next;
> > >> > + ktime_t close_time, now;
> > >> > +
> > >> > + spin_lock(&gact->entry_lock);
> > >> > +
> > >> > + next = rcu_dereference_protected(gact->next_entry,
> > >> > +
> > >> > + lockdep_is_held(&gact->entry_lock));
> > >> > +
> > >> > + /* cycle start, clear pending bit, clear total octets */
> > >> > + gact->current_gate_status = next->gate_state ?
> > >> GATE_ACT_GATE_OPEN : 0;
> > >> > + gact->current_entry_octets = 0;
> > >> > + gact->current_max_octets = next->maxoctets;
> > >> > +
> > >> > + gact->current_close_time = ktime_add_ns(gact-
> > >current_close_time,
> > >> > + next->interval);
> > >> > +
> > >> > + close_time = gact->current_close_time;
> > >> > +
> > >> > + if (list_is_last(&next->list, &p->entries))
> > >> > + next = list_first_entry(&p->entries,
> > >> > + struct tcfg_gate_entry, list);
> > >> > + else
> > >> > + next = list_next_entry(next, list);
> > >> > +
> > >> > + now = gate_get_time(gact);
> > >> > +
> > >> > + if (ktime_after(now, close_time)) {
> > >> > + ktime_t cycle, base;
> > >> > + u64 n;
> > >> > +
> > >> > + cycle = p->tcfg_cycletime;
> > >> > + base = ns_to_ktime(p->tcfg_basetime);
> > >> > + n = div64_u64(ktime_sub_ns(now, base), cycle);
> > >> > + close_time = ktime_add_ns(base, (n + 1) * cycle);
> > >> > + }
> > >> > +
> > >> > + rcu_assign_pointer(gact->next_entry, next);
> > >> > + spin_unlock(&gact->entry_lock);
> > >>
> > >> I have couple of question about synchronization here:
> > >>
> > >> - Why do you need next_entry to be rcu pointer? It is only assigned
> > >> here with entry_lock protection and in init code before action is
> > >> visible to concurrent users. I don't see any unlocked rcu-protected
> > >> readers here that could benefit from it.
> > >>
> > >> - Why create dedicated entry_lock instead of using already existing
> > >> per- action tcf_lock?
> > >
> > > Will try to use the tcf_lock for verification.

I think I added entry_lock was that I can't get the tc_action common parameter in this timer function. If I insist to use the tcf_lock, I have to move the hrtimer to struct tcf_gate which has tc_action common.
What do you think?

> > > The thoughts came from that the timer period arrived then check
> > > through the list and then update next time would take much more
> time.
> > > Action function would be busy when traffic. So use a separate lock
> > > here for
> > >
> > >>
> > >> > +
> > >> > + hrtimer_set_expires(&gact->hitimer, close_time);
> > >> > +
> > >> > + return HRTIMER_RESTART;
> > >> > +}
> > >> > +
> > >> > +static int tcf_gate_act(struct sk_buff *skb, const struct tc_action *a,
> > >> > + struct tcf_result *res) {
> > >> > + struct tcf_gate *g = to_gate(a);
> > >> > + struct gate_action *gact;
> > >> > + int action;
> > >> > +
> > >> > + tcf_lastuse_update(&g->tcf_tm);
> > >> > + bstats_cpu_update(this_cpu_ptr(g->common.cpu_bstats), skb);
> > >> > +
> > >> > + action = READ_ONCE(g->tcf_action);
> > >> > + rcu_read_lock();
> > >>
> > >> Action fastpath is already rcu read lock protected, you don't need
> > >> to manually obtain it.
> > >
> > > Will be removed.
> > >
> > >>
> > >> > + gact = rcu_dereference_bh(g->actg);
> > >> > + if (unlikely(gact->current_gate_status & GATE_ACT_PENDING))
> > >> > + {
> > >>
> > >> Can't current_gate_status be concurrently modified by timer callback?
> > >> This function doesn't use entry_lock to synchronize with timer.
> > >
> > > Will try tcf_lock either.
> > >
> > >>
> > >> > + rcu_read_unlock();
> > >> > + return action;
> > >> > + }
> > >> > +
> > >> > + if (!(gact->current_gate_status & GATE_ACT_GATE_OPEN))
> > >>
> > >> ...and here
> > >>
> > >> > + goto drop;
> > >> > +
> > >> > + if (gact->current_max_octets >= 0) {
> > >> > + gact->current_entry_octets += qdisc_pkt_len(skb);
> > >> > + if (gact->current_entry_octets >
> > >> > + gact->current_max_octets) {
> > >>
> > >> here also.
> > >>
> > >> > +
> > >> > + qstats_overlimit_inc(this_cpu_ptr(g->common.cpu_qstats));
> > >>
> > >> Please use tcf_action_inc_overlimit_qstats() and other wrappers for
> > stats.
> > >> Otherwise it will crash if user passes
> > TCA_ACT_FLAGS_NO_PERCPU_STATS
> > >> flag.
> > >
> > > The tcf_action_inc_overlimit_qstats() can't show limit counts in tc
> > > show
> > command. Is there anything need to do?
> >
> > What do you mean? Internally tcf_action_inc_overlimit_qstats() just
> > calls qstats_overlimit_inc, if cpu_qstats percpu counter is not NULL:
> >
> >
> > if (likely(a->cpu_qstats)) {
> > qstats_overlimit_inc(this_cpu_ptr(a->cpu_qstats));
> > return;
> > }
> >
> > Is there a subtle bug somewhere in this function?
>
> Sorry, I updated using the tcf_action_*, and the counting is ok. I moved
> back to the qstats_overlimit_inc() because tcf_action_* () include the
> spin_lock(&a->tcfa_lock).
> I would update to tcf_action_* () increate.
>
> >
> > >
> > > Br,
> > > Po Liu
>
> Thanks a lot.
>
> Br,
> Po Liu

Thanks a lot.

Br,
Po Liu