Re: Regression 3.0-rc5+ : khubd blocked

From: Éric Piel
Date: Fri Jul 01 2011 - 12:27:49 EST


Sorry, but from a quick look, the patch does not seem to fix the bug.

I'll try further on Monday.

Cheers,
Eric

Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:

>On Fri, 1 Jul 2011, Éric Piel wrote:
>
>> Hello,
>> I've come across what looks like a regression in the kernel a
>> few commits after 3.0-rc5.
>>
>> When I turn off a usb hub, to which my mouse and keyboard are connected,
>> and then turn it on again, they are not detected again. After unplugging
>> it and waiting a few minutes I get a "task khubd:621 blocked for more
>> than 120 seconds."
>>
>> I haven't investigated much. It seems reproducible here on my x86_64
>> laptop. It doesn't seem to happen on 3.0-rc4. Possibly relevant: my
>> kernel already has commit 2e34b429a404675dc4fc4ad2ee339eea028da3ca
>> "Merge branch 'usb-linus' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6"
>>
>> Let me know if you need me to investigate more, or maybe there is
>> already a fix for that bug?
>>
>> Below is the full hung-task message.
>>
>> Cheers,
>> Éric
>>
>>
>> Jul 1 14:08:16 dutifh kernel: INFO: task khubd:621 blocked for more than 120 seconds.
>> Jul 1 14:08:16 dutifh kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Jul 1 14:08:16 dutifh kernel: khubd D ffff88013a30dfd8 0 621 2 0x00000000
>> Jul 1 14:08:16 dutifh kernel: ffff88013a30db00 0000000000000046 ffff88013a30da10 ffffffffa00dd852
>> Jul 1 14:08:16 dutifh kernel: ffff88013b954320 ffff88013a30dfd8 ffff88013a30dfd8 ffff88013a30dfd8
>> Jul 1 14:08:16 dutifh kernel: ffff88013b891660 ffff88013b954320 ffff8800bb0084c0 dead000000100100
>> Jul 1 14:08:16 dutifh kernel: Call Trace:
>> Jul 1 14:08:16 dutifh kernel: [<ffffffffa00dd852>] ? usb_hcd_giveback_urb+0x72/0xe0 [usbcore]
>> Jul 1 14:08:16 dutifh kernel: [<ffffffff8109e24f>] ? __rcu_read_unlock+0x2f/0x200
>> Jul 1 14:08:16 dutifh kernel: [<ffffffff81388634>] __mutex_lock_slowpath+0xf4/0x190
>> Jul 1 14:08:16 dutifh kernel: [<ffffffff8138806d>] mutex_lock+0x1d/0x40
>> Jul 1 14:08:16 dutifh kernel: [<ffffffffa00e1e22>] usb_set_interface+0x62/0x250 [usbcore]
>> Jul 1 14:08:16 dutifh kernel: [<ffffffffa00e3b9f>] usb_unbind_interface+0x10f/0x180 [usbcore]
>> Jul 1 14:08:16 dutifh kernel: [<ffffffff81269217>] __device_release_driver+0x77/0xd0
>> Jul 1 14:08:16 dutifh kernel: [<ffffffff81269297>] device_release_driver+0x27/0x40
>> Jul 1 14:08:16 dutifh kernel: [<ffffffff81268d83>] bus_remove_device+0x73/0xb0
>> Jul 1 14:08:16 dutifh kernel: [<ffffffff81266845>] device_del+0x125/0x1a0
>> Jul 1 14:08:16 dutifh kernel: [<ffffffffa00e192c>] usb_disable_device+0x7c/0x1a0 [usbcore]
>> Jul 1 14:08:16 dutifh kernel: [<ffffffffa00d9f80>] usb_disconnect+0xa0/0x140 [usbcore]
>
>It appears that this was caused by Sarah's commit
>fccf4e86200b8f5edd9a65da26f150e32ba79808 (USB: Free bandwidth when
>usb_disable_device is called). usb_disconnect() grabs the
>bandwidth_mutex before calling usb_disable_device(), which calls down
>indirectly to usb_set_interface(), which tries to acquire the
>bandwidth_mutex.
>
>This patch should fix the problem. Still, this whole area cries out
>for some serious rewriting.
>
>Alan Stern
>
>
>
>Index: usb-3.0/drivers/usb/core/message.c
>===================================================================
>--- usb-3.0.orig/drivers/usb/core/message.c
>+++ usb-3.0/drivers/usb/core/message.c
>@@ -1273,6 +1273,8 @@ int usb_set_interface(struct usb_device
> 			interface);
> 		return -EINVAL;
> 	}
>+	if (iface->unregistering)
>+		return -ENODEV;
>
> 	alt = usb_altnum_to_altsetting(iface, alternate);
> 	if (!alt) {
>
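
For readers following the analysis quoted above, here is a minimal user-space sketch of the locking pattern Alan describes: one function takes a mutex and then calls down into another function that tries to take the same mutex. It is not kernel code. Every demo_* name is hypothetical, a POSIX error-checking mutex stands in for bandwidth_mutex so that the second acquisition returns EDEADLK instead of hanging the way khubd did, and setting the unregistering flag before the call is an assumption made only so that the guard from the patch has something to test. As noted at the top of this message, the patch did not appear to fix the real bug at first glance, so this only illustrates the reasoning.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a USB interface; not a kernel structure. */
struct demo_iface {
	bool unregistering;
};

/* Stand-in for bandwidth_mutex. */
static pthread_mutex_t demo_bandwidth_mutex;

/*
 * Use an error-checking mutex so that relocking from the owning thread
 * reports EDEADLK instead of blocking forever.
 */
static void demo_init(void)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&demo_bandwidth_mutex, &attr);
	pthread_mutexattr_destroy(&attr);
}

/*
 * Models the callee: optionally bail out early for an interface that is
 * going away (the guard added by the patch), otherwise take the mutex.
 */
static int demo_set_interface(struct demo_iface *iface, bool with_guard)
{
	int rc;

	if (with_guard && iface->unregistering)
		return -ENODEV;

	rc = pthread_mutex_lock(&demo_bandwidth_mutex);
	if (rc != 0)
		return -rc;	/* -EDEADLK if this thread already holds it */

	/* ... the altsetting change would happen here ... */

	pthread_mutex_unlock(&demo_bandwidth_mutex);
	return 0;
}

/*
 * Models the caller: holds the mutex, marks the interface as going away
 * (assumption), then calls down into the function above.
 */
static int demo_disconnect(struct demo_iface *iface, bool with_guard)
{
	int rc;

	pthread_mutex_lock(&demo_bandwidth_mutex);
	iface->unregistering = true;
	rc = demo_set_interface(iface, with_guard);
	pthread_mutex_unlock(&demo_bandwidth_mutex);
	return rc;
}
```

After calling demo_init(), demo_disconnect(&iface, false) reports the self-deadlock as -EDEADLK, while demo_disconnect(&iface, true) returns -ENODEV before the mutex is touched a second time.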