Re: deadlock in "modprobe -r ohci1394" shortly after "modprobeohci1394"

From: Stefan Richter
Date: Mon Nov 20 2006 - 19:21:54 EST


On 20 Nov, Alan Stern wrote:
>> > Is the problem caused by the fact that some of these struct device's
>> > aren't bound to a driver? Remember, bus_rescan_devices() will skip over
>> > anything that already has a driver. Could you solve your problem by
>> > adding a do_nothing driver that would bind to these otherwise unused
>> > devices?

There are three minor problems with this approach:
- some ballast in system memory for the dummy driver struct
- the dummy driver is visible in sysfs
- the (root) user can unbind the driver which will recreate the
preconditions for the deadlock

Nevertheless, here is the patch. I find it ugly, maybe even more so than
my previous up(&dev->parent->sem) hack, but it works as you said.


From: Stefan Richter <stefanr@xxxxxxxxxxxxxxxxx>
Subject: ieee1394: nodemgr: fix deadlock in shutdown

If "modprobe ohci1394" was quickly followed by "modprobe -r ohci1394",
say with 1 second pause in between, the modprobe -r got stuck in
uninterruptible sleep in kthread_stop. At the same time the knodemgrd
slept uninterruptibly in bus_rescan_devices_helper.

This was a regression since Linux 2.6.16,
commit bf74ad5bc41727d5f2f1c6bedb2c1fac394de731
"Hold the device's parent's lock during probe and remove"

The fix (or rather workaround) adds a dummy driver to the hpsb_host
device. Now bus_rescan_devices_helper won't scan it anymore. This
doesn't hurt since we have no drivers which will bind to these devices
and it is unlikely that there will ever be such a driver.

Signed-off-by: Stefan Richter <stefanr@xxxxxxxxxxxxxxxxx>
---
drivers/ieee1394/nodemgr.c | 19 ++++++++++++++++---
1 files changed, 16 insertions(+), 3 deletions(-)

Index: linux-2.6.19-rc4/drivers/ieee1394/nodemgr.c
===================================================================
--- linux-2.6.19-rc4.orig/drivers/ieee1394/nodemgr.c 2006-11-21 00:09:25.000000000 +0100
+++ linux-2.6.19-rc4/drivers/ieee1394/nodemgr.c 2006-11-21 01:00:28.000000000 +0100
@@ -259,9 +259,19 @@ static struct device nodemgr_dev_templat
.release = nodemgr_release_ne,
};

+/* The dummy driver prevents the host devices from being scanned. We have no
+ * useful drivers for them yet, and there is a deadlock possible if the driver
+ * core scans the host device while the host's low-level driver, i.e. the host's
+ * parent device, is being removed. */
+static struct device_driver nodemgr_dev_dummy_driver = {
+ .bus = &ieee1394_bus_type,
+ .name = "none",
+};
+
struct device nodemgr_dev_template_host = {
.bus = &ieee1394_bus_type,
.release = nodemgr_release_host,
+ .driver = &nodemgr_dev_dummy_driver,
};


@@ -703,12 +713,15 @@ static int nodemgr_bus_match(struct devi
if (dev->platform_data != &nodemgr_ud_platform_data)
return 0;

- ud = container_of(dev, struct unit_directory, device);
- driver = container_of(drv, struct hpsb_protocol_driver, driver);
+ /* The dummy driver is not wrapped in a hpsb_protocol_driver */
+ if (drv == &nodemgr_dev_dummy_driver)
+ return 0;

+ ud = container_of(dev, struct unit_directory, device);
if (ud->ne->in_limbo || ud->ignore_driver)
return 0;

+ driver = container_of(drv, struct hpsb_protocol_driver, driver);
for (id = driver->id_table; id->match_flags != 0; id++) {
if ((id->match_flags & IEEE1394_MATCH_VENDOR_ID) &&
id->vendor_id != ud->vendor_id)
@@ -1899,7 +1912,7 @@ int init_ieee1394_nodemgr(void)
class_unregister(&nodemgr_ne_class);
return error;
}
-
+ error = driver_register(&nodemgr_dev_dummy_driver);
hpsb_register_highlevel(&nodemgr_highlevel);
return 0;
}



--
Stefan Richter
-=====-=-==- =-== =-=-=
http://arcgraph.de/sr/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/