Re: [net-next Patch v2 4/5] octeontx2-pf: Add devlink support to configure TL1 RR_PRIO

From: Maxim Mikityanskiy
Date: Mon Jan 23 2023 - 15:06:19 EST


On Mon, Jan 23, 2023 at 05:03:01PM +0000, Hariprasad Kelam wrote:
>
>
> >
> > On Fri, Jan 20, 2023 at 08:50:16AM +0000, Hariprasad Kelam wrote:
> > >
> > > On Wed, Jan 18, 2023 at 04:21:06PM +0530, Hariprasad Kelam wrote:
> > > > > All VFs and the PF netdev share the same TL1 scheduler; each interface
> > > > > (PF or VF) has its own TL2 scheduler under the same parent TL1. The
> > > > > TL1 RR_PRIO value is static, and the PF/VFs use the same value to
> > > > > configure their TL2 node priority in case of DWRR children.
> > > >
> > > > This patch adds support to configure TL1 RR_PRIO value using devlink.
> > > > The TL1 RR_PRIO can be configured for each PF. The VFs are not
> > > > allowed to configure TL1 RR_PRIO value. The VFs can get the
> > > > RR_PRIO value from the mailbox NIX_TXSCH_ALLOC response parameter aggr_lvl_rr_prio.
> > >
> > > I asked this question under v1, but didn't get an answer, could you shed some light?
> > >
> > > > "Could you please elaborate on how these priorities of Transmit Levels are related to HTB priorities? I don't seem to understand why something has to be configured with devlink in addition to HTB."
> > >
> > > SMQ (send meta-descriptor queue) and MDQ (meta-descriptor queue) are the first transmit levels.
> > > Each send queue is mapped with SMQ.
> > >
> > > As mentioned in cover letter, each egress packet needs to traverse all transmit levels starting from TL5 to TL1.
> >
> > Yeah, I saw that, just some details about your hardware which might be obvious to you aren't so clear to me...
> >
> > Do these transmit levels map to "layers" of HTB hierarchy? Does it look like this, or is my understanding completely wrong?
> >
> > TL1             [HTB root node]
> >                    /       \
> > TL2        [HTB node]      [HTB node]
> >             /      \           |
> > TL3 [HTB node]  [HTB node] [HTB node]
> > ...                ...
> >
> > Transmit levels to HTB mapping is correct.
> >
> >
> >
> > > This applies to non-QOS Send queues as well.
> > >
> > > SMQ/MDQ --> TL4 -->TL3 -->TL2 -->TL1
> > >
> > > By default, non-QoS queues use a default hierarchy with round-robin priority.
> > > To avoid conflicts with QoS tree priorities, the user can choose the round-robin priority with devlink before the QoS tree is formed.
> >
> > So, this priority that you set with devlink is basically a weight of unclassified (default) traffic for round robin between unclassified and classified traffic, right? I.e. you have two hierarchies (one for HTB, another for non-QoS queue), and you do DWRR between them, according to this priority?
> >
> >
> > Not exactly. In the given scenario, multiple VFs are attached to the PF netdev.
> > Each VF's unclassified traffic forms a hierarchy, and the PF also forms a hierarchy for its unclassified traffic.
> >
> > Traffic from all these trees (multiple VFs and the PF) is aggregated at TL1. HW performs DWRR among them, since these TL2 queues (belonging to each PF and VF netdev) will be configured with the same priority by the driver.
> >
> > Currently, this priority is hard coded. Now we are providing this as a configurable value to the user.
> >
> > Now if a user adds an HTB node, it will have strict priority at the TL2 level; since its priority differs from the DWRR priority, this traffic won't be affected by the DWRR unclassified traffic.
>
> So, did I get it right now?
>
>                                         [strict priority**]
>                            /---------/                        \-----\
>                            |                                        |
>                         [DWRR*]                                     |
>         /---------------/  |  \---------------\                     |
>         |                  |                  |                     |
> [ Hierarchy for  ] [ Hierarchy for  ] [ Hierarchy for  ]            |
> [  unclassified  ] [  unclassified  ] [  unclassified  ]            |
> [traffic from PF ] [traffic from VF1] [traffic from VF2]            |
> [       ***      ] [       ***      ] [       ***      ]            |
>                                                                     |
>                                                           [HTB hierarchy using]
>                                                           [  strict priority  ]
>                                                           [   between nodes   ]
>
>
>
> Adjusted picture
>
>                            /---------------------------------------/                            Transmit level 1
>                            |                                       |
>                         [DWRR*] [ priority 6 ]            [strict priority** ] [ priority 0 ]   Transmit level 2
>         /---------------/  |  \---------------\                    |
>         |                  |                  |                    |
> [ Hierarchy for  ] [ Hierarchy for  ] [ Hierarchy for  ]  [  Hierarchy for  ]
> [  unclassified  ] [  unclassified  ] [  unclassified  ]  [ strict priority ]
> [traffic from PF ] [traffic from VF1] [traffic from VF2]
> [       ***      ] [       ***      ] [       ***      ]
>
>
>
> As far as I understand, you set priorities at ***, which affect DWRR balancing at *, but it's not clear to me how the selection at ** works.
> Does the HTB hierarchy have some fixed priority,
>
> Hardware supports priorities from 0 to 7; a lower value means higher priority.
> Nodes having the same priority are treated as DWRR children.
>
> i.e. the user can set priority for unclassified traffic to be higher or lower than HTB traffic?
>
> Yes, it's user configurable; unclassified traffic priority can be higher or lower than HTB traffic if the user wishes to configure it so.
>
> Please also point me at any inaccuracies in my picture, I really want to understand the algorithm here, because configuring additional priorities outside of HTB looks unusual to me.
>
> Please check the adjusted picture. Let us assume a user has set the priority to 6 for DWRR (unclassified traffic) and the HTB strict priority to 0.
> Once all traffic reaches TL2, the hardware algorithm first picks the HTB strict-priority traffic and processes the DWRR traffic later, according to their priorities.

OK, I seem to get it now, thanks for the explanation!
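
To double-check my understanding, here is a minimal Python sketch of the
selection as described above: nodes carry a priority from 0 to 7, a lower
value is served first, and nodes sharing a priority are balanced with DWRR.
The Node class, the schedule() helper, the quantum values and the packet
sizes are all made up for illustration; this is only a model of the
explanation in this thread, not the actual hardware algorithm or driver code.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    prio: int             # 0..7, a lower value is served first
    quantum: int = 1500   # hypothetical DWRR weight, bytes per round
    queue: deque = field(default_factory=deque)  # pending packet sizes
    deficit: int = 0


def schedule(nodes):
    """Return the order in which node names get to transmit."""
    order = []
    while any(n.queue for n in nodes):
        # Strict priority: only the best backlogged priority is served.
        best = min(n.prio for n in nodes if n.queue)
        group = [n for n in nodes if n.prio == best and n.queue]
        # One DWRR round among the nodes that share that priority.
        for n in group:
            n.deficit += n.quantum
            while n.queue and n.queue[0] <= n.deficit:
                n.deficit -= n.queue.popleft()
                order.append(n.name)
            if not n.queue:
                n.deficit = 0  # an idle node does not keep its credit
    return order


nodes = [
    Node("htb", prio=0, queue=deque([1000, 1000])),
    Node("pf", prio=6, quantum=1000, queue=deque([1000, 1000])),
    Node("vf1", prio=6, quantum=2000, queue=deque([1000] * 4)),
]
order = schedule(nodes)
print(order)
```

With these made-up numbers the HTB node (priority 0) drains first, and only
then do the PF and VF1 hierarchies (both priority 6) share the link according
to their quanta.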

How do you set the priority for HTB, though? You mentioned this command
to set priority of unclassified traffic:

    devlink -p dev param set pci/0002:04:00.0 name tl1_rr_prio value 6 \
        cmode runtime

But what is the command to change priority for HTB?

What bothers me about using devlink to configure HTB priority is:

1. Software HTB implementation doesn't have this functionality, and it
always prioritizes unclassified traffic. As far as I understand, the
rule for tc stuff is that all features must have a reference
implementation in software.

2. Adding a flag (prefer unclassified vs prefer classified) to HTB
itself may not be straightforward, because your devlink command has a
second purpose of setting priorities between PFs/VFs, and it may
conflict with the HTB flag.

>
> >
> > Thanks,
> > Hariprasad k