RE: [PATCH V2 net-next] net: fec: add CBS offload support

From: Wei Fang
Date: Thu Feb 16 2023 - 07:43:19 EST



> -----Original Message-----
> From: Andrew Lunn <andrew@xxxxxxx>
> Sent: February 14, 2023 22:29
> To: Wei Fang <wei.fang@xxxxxxx>
> Cc: Alexander Lobakin <alexandr.lobakin@xxxxxxxxx>; Shenwei Wang
> <shenwei.wang@xxxxxxx>; Clark Wang <xiaoning.wang@xxxxxxx>;
> davem@xxxxxxxxxxxxx; edumazet@xxxxxxxxxx; kuba@xxxxxxxxxx;
> pabeni@xxxxxxxxxx; simon.horman@xxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> dl-linux-imx <linux-imx@xxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH V2 net-next] net: fec: add CBS offload support
>
> > Sorry, I'm not very familiar with the configuration of the pure software
> > implementation of CBS. I tried to configure CBS as follows: the
> > bandwidth of queue 1 was set to 30Mbps and that of queue 2
> > to 20Mbps. Then one stream was sent to queue 1 at a rate
> > of 50Mbps; the link speed was 1Gbps. But the result suggested that CBS
> > did not take effect in my previous test.
>
> I'm not that familiar with CBS, but that is what i would expect. You are over
> subscribing the queue by 20Mbps, so that 20Mbps gets relegated to best effort.
> And since you have a 1G link, you have plenty of best effort bandwidth.
>
That is not how CBS behaves. If the bandwidth of a queue is set to 20Mbps, the
maximum transmit rate of that queue shouldn't exceed 20Mbps. I have found the
root cause of why CBS did not take effect in my previous test: the
default pktgen settings bypass the Qdisc.

> As with most QoS queuing, it only really makes a difference to packet loss when
> you oversubscribe the link as a whole.
>
> So with your 30Mbps + 20Mbps + BE configuration on a 1G link, send 50Mbps
> + 0Mbps + 1Gbps. 30Mbps of your 50Mbps stream should be guaranteed to
> arrive at the destination. The remaining 20Mbps needs to share the remaining
> 970Mbps of link capacity with the 1G of BE traffic. So you would expect to see
> a few extra Kbps of queue #1 traffic arriving and around 969Mbps of best
> effort traffic.
>
> However, that is not really the case i'm interested in. This discussion started
> from the point that autoneg has resulted in a much smaller link capacity. The
> link is now over subscribed by the CBS configuration. Should the hardware just
> give up and go back to default behaviour, or should it continue to do some
> CBS?
>
See test results and responses below.

> So let's start with a 7Mbps queue 1 and 5Mbps queue 2, on a link which auto
> negs to 100Mbps. Generate traffic of 8Mbps, 6Mbps and 100Mbps BE. You
> would expect ~7Mbps, ~5Mbps and 88Mbps to arrive at the link peer. Your two
> CBS flows get their reserved bandwidth, plus a little of the BE. BE gets what
> remains of the link. Test that and make sure that is what actually happens with
> software CBS, and with your TC offload to hardware.
>
> Now force the link down to 10Mbps. The CBS queues then over subscribe the
> link. Keep with the traffic generator producing 8Mbps, 6Mbps and 100Mbps
> BE. What i guess the software CBS will do is 7Mbps, 3Mbps and
> 0 BE. You should confirm this with testing.
>
I tested the pure software CBS today. The test steps and results are below.
Link speed: 100Mbps.
Queue 0: non-CBS queue, 100Mbps traffic.
Queue 1: CBS queue, 7Mbps bandwidth and 8Mbps traffic.
Queue 2: CBS queue, 5Mbps bandwidth and 6Mbps traffic.
Results: queue 0 egress rate is 86Mbps, queue 1 egress rate is 6Mbps, and queue 2
egress rate is 4Mbps.
After changing the link speed to 10Mbps: queue 0 egress rate is 4Mbps, queue 1 egress
rate is 4Mbps, and queue 2 egress rate is 3Mbps.

> What does this mean for TC offload? You should be aiming for the same
> behaviour. So even when the link is over subscribed, you should still be
> programming the hardware.
>
Besides the test results, I also checked the CBS code. Unlike the hardware implementation,
the pure software method is more flexible: it has four parameters (idleslope, sendslope,
locredit and hicredit), and it can detect a change of link speed and adjust accordingly.
The hardware, however, only uses the idleslope parameter, so it is hard for us to make
the hardware behave like the pure software implementation when the link speed changes.
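
To illustrate how the four software parameters relate to each other and to the
link speed, here is a minimal sketch based on the formulas given in the
tc-cbs(8) man page. It is not the sch_cbs code. The function name, the default
frame sizes, and the outward rounding are my own assumptions, and the hicredit
formula only holds for the highest-priority CBS class.

```python
def cbs_params(idleslope_kbps, link_kbps,
               max_frame_bytes=1518, max_interference_bytes=1518):
    """Derive sendslope/hicredit/locredit from idleslope and link speed.

    Formulas follow tc-cbs(8); frame sizes and rounding are assumptions.
    Rates are in kbit/s, credits in bytes.
    """
    if not 0 < idleslope_kbps <= link_kbps:
        raise ValueError("idleslope must be in (0, link speed]")

    # sendslope is the rate at which credit drains while transmitting.
    sendslope = idleslope_kbps - link_kbps

    # hicredit: credit accumulated while one interfering frame is sent
    # (rounded up, using integer ceiling division).
    hicredit = -(-max_interference_bytes * idleslope_kbps // link_kbps)

    # locredit: credit consumed by sending one maximum-size frame
    # (rounded down, i.e. away from zero since the value is negative).
    locredit = max_frame_bytes * sendslope // link_kbps

    return {
        "idleslope": idleslope_kbps,
        "sendslope": sendslope,
        "hicredit": hicredit,
        "locredit": locredit,
    }


# Queue 1 from the test above: 7Mbps reserved on a 100Mbps link,
# then the same idleslope after the link drops to 10Mbps.
at_100m = cbs_params(7000, 100_000)
at_10m = cbs_params(7000, 10_000)
```

Three of the four values depend on the link speed, which is why a software
implementation that tracks the link speed can adjust them, while hardware that
only takes idleslope cannot be programmed the same way.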
So, to the question: should the hardware just give up and go back to default behaviour,
or should it continue to do some CBS?
I think we can follow the behavior of the stmmac and enetc drivers and simply keep the
bandwidth ratio constant when the link rate changes. In addition, a link speed change
is a corner case, so I don't think we need to spend more effort discussing it.
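
As a rough illustration of what "keep the bandwidth ratio constant" means, the
sketch below rescales an idleslope in proportion to the new link speed. It is
only an arithmetic example and not code from the stmmac, enetc or fec drivers;
the helper name and the kbit/s units are assumptions.

```python
def rescale_idleslope(idleslope_kbps, old_link_kbps, new_link_kbps):
    """Scale idleslope so its share of the link stays the same.

    Hypothetical helper: rates are in kbit/s, result is rounded down.
    """
    if old_link_kbps <= 0 or new_link_kbps <= 0:
        raise ValueError("link speeds must be positive")
    return idleslope_kbps * new_link_kbps // old_link_kbps


# Queues 1 and 2 from the test above (7Mbps and 5Mbps on a 100Mbps link)
# after the link drops to 10Mbps: the 7% and 5% shares are preserved.
q1_at_10m = rescale_idleslope(7000, 100_000, 10_000)  # 700 kbit/s
q2_at_10m = rescale_idleslope(5000, 100_000, 10_000)  # 500 kbit/s
```

With this approach the CBS queues never oversubscribe the link after a speed
change, at the cost of no longer honouring the absolute rates that were
originally configured.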