Re: [net-next Patch v2 4/5] octeontx2-pf: Add devlink support to configure TL1 RR_PRIO

From: Hariprasad Kelam
Date: Thu Jan 26 2023 - 12:17:09 EST




> On Tue, Jan 24, 2023 at 11:49:48AM +0000, Hariprasad Kelam wrote:
> >
> >
> > > On Mon, Jan 23, 2023 at 02:45:48PM -0800, Jakub Kicinski wrote:
> > > > On Mon, 23 Jan 2023 22:05:58 +0200 Maxim Mikityanskiy wrote:
> > > > > OK, I seem to get it now, thanks for the explanation!
> > > > >
> > > > > How do you set the priority for HTB, though? You mentioned this
> > > > > command to set priority of unclassified traffic:
> > > > >
> > > > > devlink -p dev param set pci/0002:04:00.0 name tl1_rr_prio value
> > > > > 6 \ cmode runtime
> > > > >
> > > > > But what is the command to change priority for HTB?
> > > > >
> > > > > What bothers me about using devlink to configure HTB priority is:
> > > > >
> > > > > 1. Software HTB implementation doesn't have this functionality,
> > > > > and it always prioritizes unclassified traffic. As far as I
> > > > > understand, the rule for tc stuff is that all features must have
> > > > > a reference implementation in software.
> > > > >
> > > > > 2. Adding a flag (prefer unclassified vs prefer classified) to
> > > > > HTB itself may be not straightforward, because your devlink
> > > > > command has a second purpose of setting priorities between
> > > > > PFs/VFs, and it may conflict with the HTB flag.
> > > >
> > > > If there is a two-stage hierarchy the lower level should be
> > > > controlled by devlink-rate, no?
> > >
> > > From the last picture by Hariprasad, I understood that the user sets
> > > all priorities (for unclassified traffic per PF and VF, and for HTB
> > > traffic) on the same TL2 level, i.e. it's not two-stage. (Maybe I
> > > got it all wrong again?)
> > >
> > > I asked about the command to change the HTB priority, cause the
> > > parameters aren't easily guessed, but I assume it's also devlink (i.e.
> > > driver-specific).
> > >
> > Currently, we don't support changing HTB priority since TC_HTB_MODIFY is
> not yet supported.
> > The driver implementation is inline with htb tc framework, below are
> > commands we use for setting htb priority
> >
> > ethtool -K eth0 hw-tc-offload on
> > tc qdisc replace dev eth0 root handle 1: htb offload tc class add dev
> > eth0 parent 1: classid 1:1 htb rate 10Gbit prio 2 tc class add dev
> > eth0 parent 1: classid 1:1 htb rate 10Gbit prio 3
>
> I thought there was a concept of a priority of the whole HTB tree in your
> implementation...
>
> So, if I run these commands:
>
> devlink -p dev param set pci/0002:04:00.0 name tl1_rr_prio value 2 cmode
> runtime tc class add dev eth0 parent 1: classid 1:1 htb rate 10Gbit prio 1 tc
> class add dev eth0 parent 1: classid 1:2 htb rate 10Gbit prio 3
>
> Will it prioritize class 1:1 over unclassified traffic, and unclassified traffic over
> class 1:2?
>
> > > If there were two levels (the upper level chooses who goes first:
> > > HTB or unclassified, and the lower level sets priorities per PF and
> > > VF for unclassified traffic), that would be more straightforward to
> > > solve: the upper level should be controlled by a new HTB parameter,
> > > and the lower level is device-specific, thus it goes to devlink.
> >
> > The PF netdev and VFs share the same physical link and share the same TL1
> node.
> > Hardware supports one DWRR group and the rest are strict priority
> > nodes. Driver configures the priority set by devlink to PF and VF
> > traffic TL2 nodes such that traffic is forwarded to TL1 using DWRR algorithm.
> >
> > Now that if we add htb command for unclassified traffic, at any given
> > point in time HTB rule only applies to a single interface, since we
> > require to set DWRR priority at TL1, which applies to both PF/VFs, we feel
> it's a different use case altogether.
>
> Thanks, with the example above your explanation makes sense to me now.
>
> So, basically, in your implementation, entities prioritized by hardware
> are: each HTB class, each VF and PF; you want to keep user-assigned
> priorities for HTB classes, you want to let the user assign a priority for
> unclassified traffic, but the latter must be equal for all VFs and PF (for DWRR
> to work), correct? And that devlink command is only useful in the HTB
> scenario, i.e. it doesn't matter what tl1_rr_prio you set if HTB is not used,
> right?
>
> What I don't like in the current implementation is that it adds a feature to
> HTB, bypassing HTB (also not providing a software equivalent of the feature).
> I would expect the priority of unclassified traffic to be configured with tc
> commands that set up HTB (by the way, HTB has a "default" option to choose
> a class for unclassified traffic, and a priority can be set for this class - this
> functionality can be leveraged for this purpose, or a new option can be
> added, whatever looks more appropriate). On the other hand, I understand
> your hardware limitation requiring to have the same priority for all VFs and PF
> on the same port.

Thanks for pointing out the "default" option. we will explore this option if we can fit this into our current implementation or not and get back.
Meanwhile, there are comments on other patches in this series and as we are in line with htb offload without the devlink option we will submit
a new version of patches addressing the comments excluding this patch.

Thanks,
Hariprasad k




>
> It's hard to suggest something good here, actually... An obvious thought is to
> let the first netdev that configures HTB choose the priority for unclassified
> traffic, and block attempts from other netdevs to set a different one, but this
> approach also has obvious drawbacks: PF has no special powers here, and it
> can't set the desired priority if PF itself doesn't use HTB (or doesn't configure
> it first, before VFs).
>
> Another idea of mine is to keep the devlink command for enforcement
> purpose, and make the behavior as follows:
>
> 1. The user will pick a priority for unclassified traffic when attaching HTB.
>
> 2. If tl1_rr_prio was set with devlink, block attempts to attach HTB with a
> different default priority.
>
> 3. If tl1_rr_prio wasn't set or was reset, allow attaching HTB to PF with any
> default priority.
>
> This way, VFs can't attach HTB with arbitrary priorities, only with the one that
> the admin has enforced using devlink. At the same time, if VFs aren't used,
> no extra steps are needed to just use HTB on a PF. On the other hand, it adds
> some complexity and may sound controversial to someone. Thoughts?