[Patch v2 1/2] PCI: hv: Fix a race condition when removing the device

From: longli
Date: Thu Apr 22 2021 - 01:46:04 EST


From: Long Li <longli@xxxxxxxxxxxxx>

On removing the device, any work item (hv_pci_devices_present() or
hv_pci_eject_device()) scheduled on workqueue hbus->wq may still be running
and race with hv_pci_remove().

This can happen because the host may send PCI_EJECT or PCI_BUS_RELATIONS(2)
and then rescind the channel immediately afterwards.

Fix this by setting the bus state to hv_pcibus_removing and destroying
hbus->wq, which flushes any pending work, before removing the bus.

Signed-off-by: Long Li <longli@xxxxxxxxxxxxx>
---

Change in v2: Remove unused bus state hv_pcibus_removed
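
For reviewers, the ordering argument can be sketched as a small userspace
model. This is only an illustration, not driver code: every model_* name
is hypothetical, the "workqueue" is a plain counter, and it assumes that
the channel callback declines to queue work once the state is "removing".
The old ordering is simplified to "tear down first, drain last".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum model_state { MODEL_INSTALLED, MODEL_REMOVING };

#define MODEL_MAX_WORK 8

struct model_bus {
	enum model_state state;
	bool bus_present;     /* false once the modelled PCI bus is torn down */
	size_t pending;       /* work items queued but not yet run */
	int ran_on_live_bus;  /* work items that ran while the bus existed */
	int ran_on_dead_bus;  /* work items that ran after teardown (the race) */
};

static void model_init(struct model_bus *b)
{
	b->state = MODEL_INSTALLED;
	b->bus_present = true;
	b->pending = 0;
	b->ran_on_live_bus = 0;
	b->ran_on_dead_bus = 0;
}

/* Stand-in for the channel callback: returns true if work was queued. */
static bool model_host_message(struct model_bus *b)
{
	if (b->state == MODEL_REMOVING || b->pending == MODEL_MAX_WORK)
		return false;
	b->pending++;
	return true;
}

/* Stand-in for destroy_workqueue(): run everything that is still queued. */
static void model_drain(struct model_bus *b)
{
	while (b->pending > 0) {
		b->pending--;
		if (b->bus_present)
			b->ran_on_live_bus++;
		else
			b->ran_on_dead_bus++;
	}
}

/* Ordering used by this patch: block new work, drain, then tear down. */
static void model_remove_fixed(struct model_bus *b)
{
	b->state = MODEL_REMOVING;
	model_drain(b);
	assert(b->pending == 0);  /* nothing can touch the bus from here on */
	b->bus_present = false;
}

/* Simplified old ordering: tear down first, drain the queue at the end. */
static void model_remove_old(struct model_bus *b)
{
	b->bus_present = false;
	model_drain(b);
}
```

With one message queued before removal, model_remove_fixed() runs the
work while the bus still exists, whereas model_remove_old() runs it after
teardown, which is the window this patch closes.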

drivers/pci/controller/pci-hyperv.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 27a17a1e4a7c..fc948a2ed703 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -444,7 +444,6 @@ enum hv_pcibus_state {
hv_pcibus_probed,
hv_pcibus_installed,
hv_pcibus_removing,
- hv_pcibus_removed,
hv_pcibus_maximum
};

@@ -3305,13 +3304,22 @@ static int hv_pci_remove(struct hv_device *hdev)

hbus = hv_get_drvdata(hdev);
if (hbus->state == hv_pcibus_installed) {
+ tasklet_disable(&hdev->channel->callback_event);
+ hbus->state = hv_pcibus_removing;
+ tasklet_enable(&hdev->channel->callback_event);
+ destroy_workqueue(hbus->wq);
+ /*
+ * At this point, no work is running or can be scheduled
+ * on hbus->wq, so we can't race with hv_pci_devices_present()
+ * or hv_pci_eject_device(). It's safe to proceed.
+ */
+
/* Remove the bus from PCI's point of view. */
pci_lock_rescan_remove();
pci_stop_root_bus(hbus->pci_bus);
hv_pci_remove_slots(hbus);
pci_remove_root_bus(hbus->pci_bus);
pci_unlock_rescan_remove();
- hbus->state = hv_pcibus_removed;
}

ret = hv_pci_bus_exit(hdev, false);
@@ -3326,7 +3334,6 @@ static int hv_pci_remove(struct hv_device *hdev)
irq_domain_free_fwnode(hbus->sysdata.fwnode);
put_hvpcibus(hbus);
wait_for_completion(&hbus->remove_event);
- destroy_workqueue(hbus->wq);

hv_put_dom_num(hbus->sysdata.domain);

--
2.27.0