Re: [PATCH net V4 1/4] net/mlx5: SD: Serialize init/cleanup

From: Shay Drori

Date: Thu Apr 30 2026 - 05:46:17 EST




On 30/04/2026 4:42, Jakub Kicinski wrote:
External email: Use caution opening links or attachments


This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net/mlx5: SD: Serialize init/cleanup

This commit protects the Socket-Direct (SD) init/cleanup paths with the
devcom component lock and tracks the SD group state on the primary device.
It skips init if the primary is already UP, and skips cleanup unless the
primary is UP, serializing the bring-up/tear-down sequence across PFs.

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
index 762c783156b4..d42c283cbb38 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c

[ ... ]

@@ -491,22 +507,34 @@ void mlx5_sd_cleanup(struct mlx5_core_dev *dev)
{
struct mlx5_sd *sd = mlx5_get_sd(dev);
struct mlx5_core_dev *primary, *pos;
+ struct mlx5_sd *primary_sd;
int i;

if (!sd)
return;

+ mlx5_devcom_comp_lock(sd->devcom);
if (!mlx5_devcom_comp_is_ready(sd->devcom))
- goto out;
+ goto out_unlock;

primary = mlx5_sd_get_primary(dev);
+ if (!primary)
+ goto out_unlock;
+
+ primary_sd = mlx5_get_sd(primary);
+ if (primary_sd->state != MLX5_SD_STATE_UP)
+ goto out_unlock;

If the device is unbound during the race window where the devcom component
is ready but the state is still MLX5_SD_STATE_DOWN (for example, just after
sd_register() finishes but before mlx5_sd_init() sets the state to UP), does
this early exit leave the component marked as ready?

Since mlx5_devcom_comp_set_ready(sd->devcom, false) was moved out of
sd_unregister() and into this function, skipping it here means the device will
unregister and free its struct mlx5_sd while the group is still ready.

Could a concurrent mlx5_sd_init() on another peer acquire the devcom lock,
observe that it is ready, and proceed? If the unregistered device was the
primary, mlx5_get_sd(primary) would return NULL, causing a NULL pointer
dereference at primary_sd->state. If it was a secondary, mlx5_sd_init()
would iterate over primary_sd->secondaries and access the freed device
structure, resulting in a use-after-free.

This is correct :(
will fix in next version


+
mlx5_sd_for_each_secondary(i, primary, pos)
sd_cmd_unset_secondary(pos);
sd_cmd_unset_primary(primary);
debugfs_remove_recursive(sd->dfs);

This isn't a regression introduced in this patch, but does storing sd->dfs
locally instead of on the primary device cause debugfs entries to leak?

If a different peer is unbound first, its sd->dfs is NULL, making the call
to debugfs_remove_recursive() a no-op. The cleanup then finishes and marks
the group as unready.

When the original device that created the debugfs directory is later unbound,
it sees the group is no longer ready and skips this block entirely, never
freeing its sd->dfs. Since these files hold pointers to the mlx5_core_dev
structs, accessing them from userspace after the devices are freed could
trigger a use-after-free.

I note this was fixed later in the series by commit "net/mlx5: SD, Keep
multi-pf debugfs entries on primary", but wanted to highlight it here.

like mention, this is fixed later and I don't see a reason to change the
patches order



sd_info(primary, "group id %#x, uncombined\n", sd->group_id);
-out:
+ primary_sd->state = MLX5_SD_STATE_DOWN;
+ mlx5_devcom_comp_set_ready(sd->devcom, false);
+out_unlock:
+ mlx5_devcom_comp_unlock(sd->devcom);
sd_unregister(dev);
sd_cleanup(dev);
}
--
pw-bot: cr