[PATCH 3.16 012/157] IB/mlx4: Fix race condition between catas error reset and aliasguid flows

From: Ben Hutchings
Date: Sat Aug 10 2019 - 16:53:12 EST


3.16.72-rc1 review patch. If anyone has any objections, please let me know.

------------------

From: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>

commit 587443e7773e150ae29e643ee8f41a1eed226565 upstream.

Code review revealed a race condition which could allow the catas error
flow to interrupt the alias guid query post mechanism at random points.
Thiis is fixed by doing cancel_delayed_work_sync() instead of
cancel_delayed_work() during the alias guid mechanism destroy flow.

Fixes: a0c64a17aba8 ("mlx4: Add alias_guid mechanism")
Signed-off-by: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
drivers/infiniband/hw/mlx4/alias_GUID.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -579,8 +579,8 @@ void mlx4_ib_destroy_alias_guid_service(
unsigned long flags;

for (i = 0 ; i < dev->num_ports; i++) {
- cancel_delayed_work(&dev->sriov.alias_guid.ports_guid[i].alias_guid_work);
det = &sriov->alias_guid.ports_guid[i];
+ cancel_delayed_work_sync(&det->alias_guid_work);
spin_lock_irqsave(&sriov->alias_guid.ag_work_lock, flags);
while (!list_empty(&det->cb_list)) {
cb_ctx = list_entry(det->cb_list.next,