[PATCH 15/15] habanalabs: never fail hard reset of device

From: Oded Gabbay
Date: Sun Mar 17 2019 - 16:00:07 EST


Hard-reset of our device should never fail, due to dangers of permanent
damage to the H/W.

This patch removes the last place in the reset path where the driver might
exit before doing the actual reset.

Signed-off-by: Oded Gabbay <oded.gabbay@xxxxxxxxx>
---
drivers/misc/habanalabs/device.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index 77d51be66c7e..c51d1062d0bc 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -663,17 +663,9 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset,
/* Go over all the queues, release all CS and their jobs */
hl_cs_rollback_all(hdev);

- if (hard_reset) {
- /* Release kernel context */
- if (hl_ctx_put(hdev->kernel_ctx) != 1) {
- dev_err(hdev->dev,
- "kernel ctx is alive during hard reset\n");
- rc = -EBUSY;
- goto out_err;
- }
-
+ /* Release kernel context */
+ if ((hard_reset) && (hl_ctx_put(hdev->kernel_ctx) == 1))
hdev->kernel_ctx = NULL;
- }

/* Reset the H/W. It will be in idle state after this returns */
hdev->asic_funcs->hw_fini(hdev, hard_reset);
@@ -699,6 +691,13 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset,
if (hard_reset) {
hdev->device_cpu_disabled = false;

+ if (hdev->kernel_ctx) {
+ dev_crit(hdev->dev,
+ "kernel ctx was alive during hard reset, something is terribly wrong\n");
+ rc = -EBUSY;
+ goto out_err;
+ }
+
/* Allocate the kernel context */
hdev->kernel_ctx = kzalloc(sizeof(*hdev->kernel_ctx),
GFP_KERNEL);
--
2.17.1