Re: [PATCH] usb: host: xhci: Replace bus lock with host controller lock

From: Chris Bainbridge
Date: Fri Feb 05 2016 - 10:14:17 EST


Running task list at fail point:

[ 8.978405] kworker/3:0 R running task 11760 24 2 0x00000000
[ 8.979624] Workqueue: usb_hub_wq hub_event
[ 8.980831] ffff880260613af8 ffff880260613ac0 00000000811247ed ffff8802604597c0
[ 8.982061] ffff880260608000 ffff880260614000 0000000000000100 ffff880260613b70
[ 8.983280] ffff880260613c14 0000000000001388 ffff880260613b10 ffffffff81b9bab7
[ 8.984500] Call Trace:
[ 8.985698] [<ffffffff81b9bab7>] schedule+0x37/0x90
[ 8.986918] [<ffffffff817da7cd>] usb_kill_urb+0x8d/0xd0
[ 8.988130] [<ffffffff8111e5e0>] ? wake_up_atomic_t+0x30/0x30
[ 8.989343] [<ffffffff817dafbe>] usb_start_wait_urb+0xbe/0x150
[ 8.990561] [<ffffffff817db10c>] usb_control_msg+0xbc/0xf0
[ 8.991766] [<ffffffff817d07de>] hub_port_init+0x51e/0xb70
[ 8.992964] [<ffffffff817d4697>] hub_event+0x817/0x1570
[ 8.994156] [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620
[ 8.995342] [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620
[ 8.996528] [<ffffffff810f4684>] worker_thread+0x64/0x4b0
[ 8.997707] [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390
[ 8.998883] [<ffffffff810fa7f5>] kthread+0x105/0x120
[ 9.000056] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200
[ 9.001241] [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70
[ 9.002420] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200

[ 9.094152] kworker/3:1 R running task 11496 238 2 0x00000000
[ 9.095361] Workqueue: pm pm_runtime_work
[ 9.096545] ffff88025fc13ae0 ffff88025fc13aa8 00000003811247ed ffff880260bc0000
[ 9.097767] ffff8802603aaf80 ffff88025fc14000 ffff88025cc7a520 ffff88007c4b7978
[ 9.098986] ffff88025cc7a520 ffff88007c4b7978 ffff88025fc13af8 ffffffff81b9bab7
[ 9.100213] Call Trace:
[ 9.101415] [<ffffffff81b9bab7>] schedule+0x37/0x90
[ 9.102626] [<ffffffff817da7cd>] usb_kill_urb+0x8d/0xd0
[ 9.103835] [<ffffffff8111e5e0>] ? wake_up_atomic_t+0x30/0x30
[ 9.105035] [<ffffffff817d264b>] hub_quiesce+0x6b/0xa0
[ 9.106226] [<ffffffff817d281a>] hub_suspend+0x12a/0x250
[ 9.107417] [<ffffffff817de4b5>] usb_suspend_both+0x95/0x1d0
[ 9.108600] [<ffffffff817df64e>] usb_runtime_suspend+0x2e/0x70
[ 9.109783] [<ffffffff817df620>] ? usb_probe_interface+0x310/0x310
[ 9.110958] [<ffffffff8170d5ed>] __rpm_callback+0x2d/0x70
[ 9.112130] [<ffffffff817df620>] ? usb_probe_interface+0x310/0x310
[ 9.113305] [<ffffffff8170d64d>] rpm_callback+0x1d/0x90
[ 9.114482] [<ffffffff8170dbdb>] rpm_suspend+0x14b/0x750
[ 9.115664] [<ffffffff8170f697>] __pm_runtime_suspend+0x57/0x90
[ 9.116852] [<ffffffff817df6b0>] ? usb_runtime_resume+0x20/0x20
[ 9.118034] [<ffffffff817df6d5>] usb_runtime_idle+0x25/0x30
[ 9.119210] [<ffffffff8170d5ed>] __rpm_callback+0x2d/0x70
[ 9.120382] [<ffffffff8170e501>] rpm_idle+0x221/0x410
[ 9.121541] [<ffffffff8170f7e9>] pm_runtime_work+0xa9/0xc0
[ 9.122693] [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620
[ 9.123850] [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620
[ 9.124993] [<ffffffff810f4684>] worker_thread+0x64/0x4b0
[ 9.126134] [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390
[ 9.127272] [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390
[ 9.128403] [<ffffffff810fa7f5>] kthread+0x105/0x120
[ 9.129524] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200
[ 9.130653] [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70
[ 9.131781] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200

[ 9.864737] kworker/3:2 R running task 11888 1348 2 0x00000008
[ 9.865947] Workqueue: usb_hub_wq hub_event
[ 9.867152] ffff88025dbcdf00 ffff88025dbd3b88 ffffffff81109a8c ffffffff811099a9
[ 9.868386] ffff88025dbcdf00 0000000000000000 ffff88025dbce270 ffff88025dbd3bb8
[ 9.869610] ffffffff81109c04 ffff88025dac8000 ffff88025dac8368 ffff88025c8a4000
[ 9.870837] Call Trace:
[ 9.872044] [<ffffffff81109a8c>] sched_show_task+0x15c/0x260
[ 9.873248] [<ffffffff811099a9>] ? sched_show_task+0x79/0x260
[ 9.874456] [<ffffffff81109c04>] show_state_filter+0x74/0xc0
[ 9.875664] [<ffffffff817fd36d>] xhci_setup_device+0x53d/0xa40
[ 9.876871] [<ffffffff817fd87e>] xhci_address_device+0xe/0x10
[ 9.878068] [<ffffffff817d047f>] hub_port_init+0x1bf/0xb70
[ 9.879262] [<ffffffff811247ed>] ? trace_hardirqs_on+0xd/0x10
[ 9.880465] [<ffffffff817d4697>] hub_event+0x817/0x1570
[ 9.881668] [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620
[ 9.882869] [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620
[ 9.884074] [<ffffffff810f4684>] worker_thread+0x64/0x4b0
[ 9.885268] [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390
[ 9.886457] [<ffffffff810fa7f5>] kthread+0x105/0x120
[ 9.887634] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200
[ 9.888817] [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70
[ 9.889995] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200

[ 9.891174] kworker/3:3 R running task 10808 1350 2 0x00000000
[ 9.892367] Workqueue: events_freezable usb_stor_scan_dwork
[ 9.893560] ffff88025dbdbb78 ffff88026a0ccd80 000000006a0ccd80 ffff8802604597c0
[ 9.894773] ffff88025dbcc740 ffff88025dbdc000 ffff88025dbdbbb0 ffff88026a0ccd80
[ 9.896004] ffff88026a0ccd80 0000000000000003 ffff88025dbdbb90 ffffffff81b9bab7
[ 9.897233] Call Trace:
[ 9.898444] [<ffffffff81b9bab7>] schedule+0x37/0x90
[ 9.899670] [<ffffffff81ba026c>] schedule_timeout+0x17c/0x2f0
[ 9.900893] [<ffffffff81143f10>] ? init_timer_on_stack_key+0x40/0x40
[ 9.902099] [<ffffffff81b9c98d>] wait_for_completion_interruptible_timeout+0xcd/0x160
[ 9.903324] [<ffffffff81106620>] ? wake_up_q+0x70/0x70
[ 9.904536] [<ffffffff818197e8>] usb_stor_msg_common+0xd8/0x130
[ 9.905751] [<ffffffff818198d6>] usb_stor_control_msg+0x96/0xb0
[ 9.906961] [<ffffffff8181a6d1>] usb_stor_Bulk_max_lun+0x51/0xa0
[ 9.908163] [<ffffffff8181afbf>] usb_stor_scan_dwork+0x7f/0xe0
[ 9.909361] [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620
[ 9.910563] [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620
[ 9.911762] [<ffffffff810f4684>] worker_thread+0x64/0x4b0
[ 9.912976] [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390
[ 9.914190] [<ffffffff810fa7f5>] kthread+0x105/0x120
[ 9.915396] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200
[ 9.916607] [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70
[ 9.917817] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200

[ 11.003856] 7 locks held by kworker/3:2/1348:
[ 11.005032] #0: ("usb_hub_wq"){.+.+.+}, at: [<ffffffff810f3dcf>] process_one_work+0x15f/0x620
[ 11.006250] #1: ((&hub->events)){+.+.+.}, at: [<ffffffff810f3dcf>] process_one_work+0x15f/0x620
[ 11.007479] #2: (&dev->mutex){......}, at: [<ffffffff817d3ee4>] hub_event+0x64/0x1570
[ 11.008723] #3: (&port_dev->status_lock){+.+.+.}, at: [<ffffffff817d4685>] hub_event+0x805/0x1570
[ 11.009975] #4: (&bus->usb_address0_mutex){+.+.+.}, at: [<ffffffff817d031d>] hub_port_init+0x5d/0xb70
[ 11.011233] #5: (&xhci->mutex){+.+.+.}, at: [<ffffffff817fce8f>] xhci_setup_device+0x5f/0xa40
[ 11.012496] #6: (tasklist_lock){.+.+..}, at: [<ffffffff8112319f>] debug_show_all_locks+0x3f/0x1b0


kworker/3:0 and kworker/3:2 are both in hub_port_init. (I don't think
the other running threads are relevant but include them for
completeness).

Some of the functions appear to be inlined, the exact call chain is:

hub_port_init
usb_get_device_descriptor
usb_get_descriptor
usb_control_msg
usb_internal_control_msg
usb_start_wait_urb
usb_submit_urb / wait_for_completion_timeout / usb_kill_urb

hub_port_init
hub_set_address
xhci_address_device
xhci_setup_device

So xhci_setup_device is entered while there is an outstanding URB on the
other bus. Unless anyone can think of a better way to fix this I'll make
the requested changes and resend my patch.