Re: [PATCH v2 0/2] ALSA: Fix usb-audio races

From: Takashi Iwai
Date: Tue Jan 03 2017 - 04:15:49 EST


On Tue, 03 Jan 2017 09:47:06 +0100,
Ioan-Adrian Ratiu wrote:
>
> On Tue, 03 Jan 2017, Ioan-Adrian Ratiu <adi@xxxxxxxxxx> wrote:
> > On Mon, 02 Jan 2017, Ioan-Adrian Ratiu <adi@xxxxxxxxxx> wrote:
> >> Many thanks to Takashi Iwai & Sakamoto for their awesome feedback.
> >>
> >> Changes since v1:
> >> * Rebased my fix on top of tiwai's revert and integrated the changes
> >> from the original fix into this.
> >> * Dropped the stop_endpoints() call inside snd_usb_pcm_prepare() and
> >> kept the previously existing snd_usb_endpoint_sync_pending_stop() call.
> >> * Retained the deactivate_urbs() call in snd_usb_pcm_prepare(), I only
> >> removed the can_sleep logic.
> >> * Split the EP_FLAG_RUNNING check in a separate commit to keep the log
> >> clean since this is not part of the race fix.
> >>
> >> Ioan-Adrian Ratiu (2):
> >> ALSA: usb-audio: Fix irq/process data synchronization
> >> ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion
> >>
> >> sound/usb/endpoint.c | 20 ++++++++++----------
> >> sound/usb/endpoint.h | 2 +-
> >> sound/usb/pcm.c | 10 +++++-----
> >> 3 files changed, 16 insertions(+), 16 deletions(-)
> >>
> >> --
> >> 2.11.0
> >
> > @tiwai
> >
> > You mentioned in a previous email to the first version I introduced a bug
> > by unballancing snd_usb_*lock_shutdown(), I think I managed to reproduce it
> > with v2. Is this trace a symptom of the bug you were refering to or is
> > this something newly introduced in v2?
> >
> > I can't for the life of me figure out for sure how snd_usb_*lock_shutdown()
> > ballancing is affected by my patch. Any pointers how to fix this are
> > greatly appreciated.
>
> A bit more context on the trace: usb_audio_disconnect hangs while
> waiting for all pending tasks to finish while the implicit feedback
> endpoint sending errors at usb_submit_urb (err -19).
>
> I could figure these out on my own but at the moment my time is very
> limited and I don't want to keep you guys waiting after me so maybe
> asking about it will be faster.

Yes, this looks like so. snd_usb_lock_shutdown() manages the autopm
with a refcount, and in usb_audio_disconnect(), the driver waits until
the refcount is cleared. The unbalanced call would result in an
endless wait there.


Takashi


>
> Thanks,
> Ionel
>
> >
> > [ 492.558350] INFO: task kworker/2:3:133 blocked for more than 120 seconds.
> > [ 492.558355] Not tainted 4.10.0-rc2-g0c771e68bbc8 #25
> > [ 492.558356] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 492.558359] kworker/2:3 D 0 133 2 0x00000000
> > [ 492.558368] Workqueue: usb_hub_wq hub_event
> > [ 492.558370] Call Trace:
> > [ 492.558377] ? __schedule+0x1dd/0x7f0
> > [ 492.558380] ? trace_hardirqs_on+0xd/0x10
> > [ 492.558384] schedule+0x3b/0x90
> > [ 492.558392] usb_audio_disconnect+0x1bf/0x220 [snd_usb_audio]
> > [ 492.558394] ? wake_atomic_t_function+0x50/0x50
> > [ 492.558397] usb_unbind_interface+0x7a/0x260
> > [ 492.558399] device_release_driver_internal+0x158/0x210
> > [ 492.558400] device_release_driver+0xd/0x10
> > [ 492.558401] bus_remove_device+0x10f/0x190
> > [ 492.558411] device_del+0x208/0x330
> > [ 492.558413] ? usb_remove_ep_devs+0x1a/0x30
> > [ 492.558414] usb_disable_device+0x99/0x270
> > [ 492.558416] usb_disconnect+0x87/0x2a0
> > [ 492.558417] hub_event+0x962/0x1530
> > [ 492.558419] process_one_work+0x221/0x4b0
> > [ 492.558420] ? process_one_work+0x1bb/0x4b0
> > [ 492.558421] worker_thread+0x46/0x4f0
> > [ 492.558423] kthread+0x102/0x140
> > [ 492.558424] ? process_one_work+0x4b0/0x4b0
> > [ 492.558425] ? kthread_create_on_node+0x40/0x40
> > [ 492.558427] ret_from_fork+0x27/0x40
> > [ 492.558428]
> > Showing all locks held in the system:
> > [ 492.558433] 2 locks held by khungtaskd/34:
> > [ 492.558434] #0: (rcu_read_lock){......}, at: [<ffffffff810da44f>] watchdog+0x9f/0x4a0
> > [ 492.558438] #1: (tasklist_lock){.+.+..}, at: [<ffffffff81094ffd>] debug_show_all_locks+0x3d/0x1a0
> > [ 492.558449] 6 locks held by kworker/2:3/133:
> > [ 492.558449] #0: ("usb_hub_wq"){.+.+.+}, at: [<ffffffff8106705b>] process_one_work+0x1bb/0x4b0
> > [ 492.558452] #1: ((&hub->events)){+.+.+.}, at: [<ffffffff8106705b>] process_one_work+0x1bb/0x4b0
> > [ 492.558455] #2: (&dev->mutex){......}, at: [<ffffffff814a4e3c>] hub_event+0x5c/0x1530
> > [ 492.558459] #3: (&dev->mutex){......}, at: [<ffffffff814a1abe>] usb_disconnect+0x4e/0x2a0
> > [ 492.558461] #4: (&dev->mutex){......}, at: [<ffffffff8142dd84>] device_release_driver_internal+0x34/0x210
> > [ 492.558464] #5: (register_mutex#3){+.+.+.}, at: [<ffffffffa07de27e>] usb_audio_disconnect+0x2e/0x220 [snd_usb_audio]
> > [ 492.558485] 2 locks held by idn/975:
> > [ 492.558485] #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff815fd25d>] ldsem_down_read+0x2d/0x40
> > [ 492.558489] #1: (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81312160>] n_tty_read+0xb0/0x890
> > [ 492.558494] 2 locks held by idn/1311:
> > [ 492.558494] #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff815fd25d>] ldsem_down_read+0x2d/0x40
> > [ 492.558497] #1: (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81312160>] n_tty_read+0xb0/0x890
> >
> > [ 492.558500] =============================================
>