Re: [PATCH] net: mellanox: mlx5: fix error return code of mlx5e_stats_flower()

From: Roi Dayan
Date: Sun Mar 14 2021 - 05:54:06 EST




On 2021-03-12 12:47 AM, Saeed Mahameed wrote:
On Tue, 2021-03-09 at 11:44 +0200, Roi Dayan wrote:


On 2021-03-09 10:32 AM, Jia-Ju Bai wrote:


On 2021/3/9 16:24, Roi Dayan wrote:


On 2021-03-09 10:20 AM, Roi Dayan wrote:


On 2021-03-06 3:47 PM, Jia-Ju Bai wrote:
When mlx5e_tc_get_counter() returns NULL to counter or
mlx5_devcom_get_peer_data() returns NULL to peer_esw, no
error return
code of mlx5e_stats_flower() is assigned.
To fix this bug, err is assigned with -EINVAL in these cases.

Reported-by: TOTE Robot <oslab@xxxxxxxxxxxxxxx>

Hey Jia-Ju, What are the conditions for this robot to raise a flag?
sometimes it is totally normal to abort a function and return 0.. i am
just curious to know ?


Signed-off-by: Jia-Ju Bai <baijiaju1990@xxxxxxxxx>
---
  drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12
+++++++++---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 0da69b98f38f..1f2c9da7bd35 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -4380,8 +4380,10 @@ int mlx5e_stats_flower(struct
net_device
*dev, struct mlx5e_priv *priv,
      if (mlx5e_is_offloaded_flow(flow) ||
flow_flag_test(flow, CT)) {
          counter = mlx5e_tc_get_counter(flow);
-        if (!counter)
+        if (!counter) {
+            err = -EINVAL;
              goto errout;
+        }
          mlx5_fc_query_cached(counter, &bytes, &packets,
&lastuse);
      }
@@ -4390,8 +4392,10 @@ int mlx5e_stats_flower(struct
net_device
*dev, struct mlx5e_priv *priv,
       * un-offloaded while the other rule is offloaded.
       */
      peer_esw = mlx5_devcom_get_peer_data(devcom,
MLX5_DEVCOM_ESW_OFFLOADS);
-    if (!peer_esw)
+    if (!peer_esw) {
+        err = -EINVAL;


This is not an error flow, i am curious what are the thresholds of this
robot ?

note here it's not an error. it could be there is no peer esw
so just continue with the stats update.

          goto out;
+    }
      if (flow_flag_test(flow, DUP) &&
          flow_flag_test(flow->peer_flow, OFFLOADED)) {
@@ -4400,8 +4404,10 @@ int mlx5e_stats_flower(struct
net_device
*dev, struct mlx5e_priv *priv,
          u64 lastuse2;
          counter = mlx5e_tc_get_counter(flow->peer_flow);
-        if (!counter)
+        if (!counter) {
+            err = -EINVAL;

this change is problematic. the current goto is to do stats
update with
the first counter stats we got but if you now want to return an
error
then you probably should not do any update at all.

Thanks for your reply :)
I am not sure whether an error code should be returned here?
If so, flow_stats_update(...) should not be called here?


Best wishes,
Jia-Ju Bai


basically flow and peer_flow should be valid and protected from
changes,
and counter should be valid.
it looks like the check here is more of a sanity check if something
goes
wrong but shouldn't. you can just let it be, update the stats from
the
first queried counter.


Roi, let's consider returning an error code here, we shouldn't be
silently returning if we are not expecting these errors,

why would mlx5e_stats_flower() be called if stats are not offloaded ?

Thanks,
Saeed.



yes we can return an error if peer counter missing.
I just pointed out we should probably not call flow_stats_update() if
we do return an error.
the other option, as today, is updating the stats with first counter
stats and because of that we didn't return an error.