Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
From: Joe Damato
Date: Fri May 10 2024 - 00:27:29 EST
On Thu, May 09, 2024 at 05:31:06PM -0700, Joe Damato wrote:
> On Thu, May 09, 2024 at 01:16:15PM +0300, Tariq Toukan wrote:
> >
> >
> > On 09/05/2024 9:30, Joe Damato wrote:
> > > On Wed, May 08, 2024 at 07:08:39PM -0700, Jakub Kicinski wrote:
> > > > On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
> > > > > If I'm following that right and understanding mlx5 (two things I am
> > > > > unlikely to do simultaneously), that sounds to me like:
> > > > >
> > > > > - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
> > > > > priv->channels.params.num_channels (instead of priv->stats_nch),
> > > >
> > > > Yes, tho, not sure whether the "if i < ...num_channels" is even
> > > > necessary, as core already checks against real_num_rx_queues.
> > > >
> > > > > and when
> > > > > summing mlx5e_sq_stats in the latter function, it's up to
> > > > > priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
> > > > >
> > > > > - mlx5e_get_base_stats accumulates and outputs stats for everything from
> > > > > priv->channels.params.num_channels to priv->stats_nch, and
> > > >
> > > > I'm not sure num_channels gets set to 0 when device is down so possibly
> > > > from "0 if down else ...num_channels" to stats_nch.
> > >
> > > Yea, you were right:
> > >
> > >   if (priv->channels.num == 0)
> > >           i = 0;
> > >   else
> > >           i = priv->channels.params.num_channels;
> > >   for (; i < priv->stats_nch; i++) {
> > >
> > > Seems to be working now when I adjust the queue count and the test is
> > > passing as I adjust the queue count up or down. Cool.
> > >
> >
> > I agree that get_base should include all inactive queues stats.
> > But it's not straightforward to implement.
> >
> > A few guiding points:
> >
> > Use mlx5e_get_dcb_num_tc(params) for current num_tc.
> >
> > txq_ix (within the real_num_tx_queues) is calculated by c->ix + tc *
> > params->num_channels.
> >
> > The txqsq stats struct is chosen by channel_stats[c->ix]->sq[tc].
> >
> > It means, in the base stats you should include SQ stats for:
> > 1. all SQs of non-active channels, i.e. ch in [params.num_channels,
> > priv->stats_nch), tc in [0, priv->max_opened_tc).
> > 2. all SQs of non-active TCs in active channels [0, params.num_channels), tc
> > in [mlx5e_get_dcb_num_tc(params), priv->max_opened_tc).
> >
> > Now I actually see that the patch has issues in mlx5e_get_queue_stats_tx.
> > You should not loop over all TCs of channel index i.
> > You must do a reverse mapping from "i" to the pair/tuple [ch_ix, tc], and
> > then access a single TXQ stats by priv->channel_stats[ch_ix].sq[tc].
>
> It looks like txq2sq probably will help with this?
>
> Something like:
>
>   for (j = 0; j < mlx5e_get_dcb_num_tc(); j++) {
>           sq = priv->txq2sq[j];
>           if (sq->ch_ix == i) {
>                   /* this sq->stats is what I need */
>           }
>   }
>
> Is that right?

This was incorrect, but I think I got it in the v2 I just sent out. When
you have the time, please take a look at that version.

Thanks for the guidance, it was very helpful.

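
For anyone following along, below is a minimal standalone sketch of the
index arithmetic Tariq describes above: the forward mapping
txq_ix = ch_ix + tc * num_channels, its inverse, and the two cases in
which an SQ stats slot belongs in the base stats. This is not the v2
code. The sketch_* names and the struct are hypothetical stand-ins for
the fields mentioned in this thread, and the sketch assumes the device is
up (it ignores the "0 if down" case Jakub raised).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the values discussed in the thread;
 * these are NOT the real mlx5e structures. */
struct sketch_params {
	unsigned int num_channels;  /* params.num_channels (active channels) */
	unsigned int num_tc;        /* value mlx5e_get_dcb_num_tc(params) would give */
	unsigned int stats_nch;     /* priv->stats_nch */
	unsigned int max_opened_tc; /* priv->max_opened_tc */
};

struct sketch_txq_pos {
	unsigned int ch_ix;
	unsigned int tc;
};

/* Forward mapping from the thread: txq_ix = ch_ix + tc * num_channels. */
static unsigned int sketch_txq_ix(const struct sketch_params *p,
				  unsigned int ch_ix, unsigned int tc)
{
	return ch_ix + tc * p->num_channels;
}

/* Inverse of the mapping above. The caller must ensure that
 * num_channels > 0 and txq_ix < num_channels * num_tc. */
static struct sketch_txq_pos sketch_txq_to_pos(const struct sketch_params *p,
					       unsigned int txq_ix)
{
	struct sketch_txq_pos pos = {
		.ch_ix = txq_ix % p->num_channels,
		.tc = txq_ix / p->num_channels,
	};

	return pos;
}

/* True when the SQ stats slot (ch_ix, tc) falls in one of the two
 * base-stats cases listed above. */
static bool sketch_sq_in_base(const struct sketch_params *p,
			      unsigned int ch_ix, unsigned int tc)
{
	if (ch_ix >= p->stats_nch || tc >= p->max_opened_tc)
		return false;           /* no stats slot exists here */
	if (ch_ix >= p->num_channels)
		return true;            /* case 1: non-active channel */
	return tc >= p->num_tc;         /* case 2: non-active TC, active channel */
}
```

With num_channels = 4 and num_tc = 2, for example, txq index 5 maps back
to ch_ix = 1, tc = 1, so a per-queue TX stats callback would read exactly
one SQ stats entry instead of looping over every TC of a channel.
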
> Not sure if I'm missing something obvious here, sorry, I've been puzzling
> over the mlx5 source for a bit.
>
> BTW: kind of related but in mlx5e_alloc_txqsq the int tc param is unused (I
> think). It might be helpful to struct mlx5e_txqsq to have a tc field and
> then in mlx5e_alloc_txqsq:
>
> sq->tc = tc;
>
> Not sure if that'd be helpful in general, but I could send that as a
> separate patch.