Re: [PATCH V2 net-next 08/12] net: hns3: stop schedule reset service while unloading driver

From: tanhuazhong
Date: Thu Apr 25 2019 - 02:06:22 EST




On 2019/4/24 21:55, Neil Horman wrote:
On Wed, Apr 24, 2019 at 07:05:27PM +0800, Huazhong Tan wrote:
This patch uses HCLGE_STATE_REMOVING/HCLGEVF_STATE_REMOVING flag to
indicate that the driver is unloading, and we should stop new coming
reset service to be scheduled, otherwise, reset service will access
some resource which has been freed by unloading.

Signed-off-by: Huazhong Tan <tanhuazhong@xxxxxxxxxx>
Signed-off-by: Peng Li <lipeng321@xxxxxxxxxx>
---
V1->V2: fixes a flag setting error
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 4 +++-
drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 +++-
drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h | 1 +
3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 4d5568e..ead8308 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2175,7 +2175,8 @@ static void hclge_mbx_task_schedule(struct hclge_dev *hdev)
static void hclge_reset_task_schedule(struct hclge_dev *hdev)
{
- if (!test_and_set_bit(HCLGE_STATE_RST_SERVICE_SCHED, &hdev->state))
+ if (!test_bit(HCLGE_STATE_REMOVING, &hdev->state) &&
+ !test_and_set_bit(HCLGE_STATE_RST_SERVICE_SCHED, &hdev->state))
schedule_work(&hdev->rst_service_task);
}

In what use case do you need an extra bit for this? From my read, this work
task only gets scheduled from:
1) Interrupt handlers
2) Its own service task

Based on the fact that you are calling cancel_work_sync(...rst_service_task)
from the pci teardown routine, irqs should all be disabled on your devices
already (meaning interrupts shouldn't schedule it), and cancel_work_sync
guarantees that rearming cant happen from within its own service task.

Neil


Beside these two cases, when the client detects an error and requests a reset, it will call hclge_reset_event and schedule the reset work task to deal with the request. This may happen after calling cancel_work_sync(...rst_service_task) from the pci teardown routine.

Best Regards, Huazhong


.