Re: BUG: scheduling while atomic in f_fs when gadget remove driver

From: Chen Yu
Date: Wed Sep 28 2016 - 05:50:50 EST


Hi,

On 2016/9/27 18:01, Felipe Balbi wrote:
>
> Hi,
>
> Chen Yu <chenyu56@xxxxxxxxxx> writes:
>> Hi All,
>>
>> I'm working on a Hikey board based around the HiSilicon Kirin 620, with
>> Linaro kernel version 4.8-rc1, and I get the BUG error below when
>> unplugging the USB cable from the PC.
>
> which peripheral controller does this one have? Is it dwc3?
>
> I'm very interested in knowing about throughput of adb push with dwc3 + f_fs.
>

Sorry for the delay. The peripheral controller on the HiSilicon Kirin 620 is dwc2.

> Also, do you know if adb can run outside of android environment? I've
> been looking for a proper functionfs user for quite some time now :-(
>

I'm not sure whether adb can run outside of an Android environment. There are some requirements:
1. adbd must be able to run outside of the Android environment.
2. configfs and f_fs must be initialized the way Android does it. There is an example for Hikey:
https://github.com/96boards-hikey/android_device_linaro_hikey/blob/android-5.0/init.hikey.rc
lines 46-65 and lines 161-163.

Hope this helps!

>> The function using f_fs is adb, and usb_gadget_unregister_driver is
>> called after unplugging the USB cable from the PC.
>>
>> [ 89.456512s][pid:1,cpu1,init]BUG: scheduling while atomic: init/1/0x00000002
>> [ 89.456573s]Modules linked in:
>> [ 89.456604s]Preemption disabled at:[<ffffffc0006a6dc0>] composite_disconnect+0x30/0xac
>> [ 89.456665s][pid:1,cpu1,init]TGID: 1 Comm: init
>> [ 89.456695s][pid:1,cpu1,init]Call trace:
>> [ 89.456726s][pid:1,cpu1,init][<ffffffc00008a5e0>] dump_backtrace+0x0/0x15c
>> [ 89.456756s][pid:1,cpu1,init][<ffffffc00008a75c>] show_stack+0x20/0x28
>> [ 89.456756s][pid:1,cpu1,init][<ffffffc001153714>] dump_stack+0x84/0xa8
>> [ 89.456787s][pid:1,cpu1,init][<ffffffc0000cfc5c>] __schedule_bug+0x88/0xdc
>> [ 89.456817s][pid:1,cpu1,init][<ffffffc00115c4f0>] __schedule+0x714/0x854
>> [ 89.456817s][pid:1,cpu1,init][<ffffffc00115c678>] schedule+0x48/0xa4
>> [ 89.456817s][pid:1,cpu1,init][<ffffffc00115cbf0>] schedule_preempt_disabled+0x4c/0xf4
>> [ 89.456848s][pid:1,cpu1,init][<ffffffc00115ea90>] __mutex_lock_slowpath+0xbc/0x1a4
>> [ 89.456878s][pid:1,cpu1,init][<ffffffc00115ebd8>] mutex_lock+0x60/0x64
>> [ 89.456878s][pid:1,cpu1,init][<ffffffc0006beb00>] ffs_func_eps_disable.isra.17+0x54/0x114
>> [ 89.456909s][pid:1,cpu1,init][<ffffffc0006c05a4>] ffs_func_disable+0x30/0xa0
>> [ 89.456909s][pid:1,cpu1,init][<ffffffc0006a6c4c>] reset_config.isra.8+0x44/0x78
>> [ 89.456939s][pid:1,cpu1,init][<ffffffc0006a6dd8>] composite_disconnect+0x48/0xac
>> [ 89.456939s][pid:1,cpu1,init][<ffffffc0006aafd4>] android_disconnect+0x48/0x54
>> [ 89.456970s][pid:1,cpu1,init][<ffffffc0006ad9d0>] usb_gadget_remove_driver+0x58/0xa0
>> [ 89.456970s][pid:1,cpu1,init][<ffffffc0006ada90>] usb_gadget_unregister_driver+0x78/0xc4
>>
>> I checked the code of composite_disconnect and found that
>> spin_lock_irqsave is called before reset_config, and reset_config
>> ends up calling ffs_func_eps_disable.
>>
>> void composite_disconnect(struct usb_gadget *gadget)
>> {
>> 	struct usb_composite_dev *cdev = get_gadget_data(gadget);
>> 	unsigned long flags;
>>
>> 	/* REVISIT: should we have config and device level
>> 	 * disconnect callbacks?
>> 	 */
>> 	spin_lock_irqsave(&cdev->lock, flags);
>> 	if (cdev->config)
>> 		reset_config(cdev);
>> 	if (cdev->driver->disconnect)
>> 		cdev->driver->disconnect(cdev);
>> 	spin_unlock_irqrestore(&cdev->lock, flags);
>> }
>>
>> static void ffs_func_eps_disable(struct ffs_function *func)
>> {
>> 	struct ffs_ep *ep = func->eps;
>> 	struct ffs_epfile *epfile = func->ffs->epfiles;
>> 	unsigned count = func->ffs->eps_count;
>> 	unsigned long flags;
>>
>> 	do {
>> 		if (epfile)
>> 			mutex_lock(&epfile->mutex);
>> 		spin_lock_irqsave(&func->ffs->eps_lock, flags);
>> 		/* pending requests get nuked */
>> 		if (likely(ep->ep))
>> 			usb_ep_disable(ep->ep);
>> 		++ep;
>> 		spin_unlock_irqrestore(&func->ffs->eps_lock, flags);
>>
>> 		if (epfile) {
>> 			epfile->ep = NULL;
>> 			kfree(epfile->read_buffer);
>> 			epfile->read_buffer = NULL;
>> 			mutex_unlock(&epfile->mutex);
>> 			++epfile;
>> 		}
>> 	} while (--count);
>> }
>>
>> Should epfile->read_buffer be cleared somewhere else, so that the
>> mutex_lock can be removed from ffs_func_eps_disable?
>
> You are correct. There's a bug there. Can you try to propose a fix for
> it?
>
> thanks
>
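
To restate the problem for anyone following along: composite_disconnect takes cdev->lock with spin_lock_irqsave, and ffs_func_eps_disable then calls mutex_lock, which may sleep. Below is a small userspace C model of that ordering. It is not kernel code. The names atomic_depth, sleep_in_atomic_hits and the model_* functions are made up for illustration; in the kernel the check is done by the scheduler, as the __schedule_bug frame in the trace shows.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's preempt count. */
static int atomic_depth;
/* How many times a sleeping lock was taken in atomic context. */
static int sleep_in_atomic_hits;

static void model_spin_lock(void)
{
	atomic_depth++;
}

static void model_spin_unlock(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/* A mutex may sleep, so taking it while atomic_depth != 0 is the bug. */
static void model_mutex_lock(const char *who)
{
	if (atomic_depth != 0) {
		sleep_in_atomic_hits++;
		fprintf(stderr, "BUG: scheduling while atomic: %s (depth %d)\n",
			who, atomic_depth);
	}
}

static void model_mutex_unlock(void)
{
}

/* Mirrors ffs_func_eps_disable: mutex first, then the eps spinlock. */
static void model_eps_disable(void)
{
	model_mutex_lock("ffs_func_eps_disable");
	model_spin_lock();		/* eps_lock */
	model_spin_unlock();
	model_mutex_unlock();
}

/* Mirrors composite_disconnect: cdev->lock is held across reset_config. */
static void model_composite_disconnect(void)
{
	model_spin_lock();		/* cdev->lock */
	model_eps_disable();		/* via reset_config -> ffs_func_disable */
	model_spin_unlock();
}
```

Calling model_eps_disable() on its own reports nothing, while model_composite_disconnect() reports the problem once, which matches the call chain in the trace.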

I will try to fix it, but I'm busy with other tasks and cannot spend much time on it.

Do you have any suggestions about how to fix it?
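
One direction I can think of, as an untested idea: kfree itself does not sleep, so the mutex may only be needed to keep a reader from using read_buffer while it is being freed. If the pointer were detached with an atomic exchange, the disable path might not need the mutex at all. Here is a userspace sketch of that idea with C11 atomics. struct epfile_model and epfile_drop_read_buffer are hypothetical names, free() stands in for kfree(), and a real change would have to be checked against how the read path in f_fs.c uses the buffer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical model of the one ffs_epfile field that matters here. */
struct epfile_model {
	_Atomic(void *) read_buffer;
};

/*
 * Detach the buffer with an atomic exchange, then free it.
 * Nothing in here can sleep, so it could run with a spinlock held.
 * Returns 1 if a buffer was freed, 0 if there was none.
 */
static int epfile_drop_read_buffer(struct epfile_model *epfile)
{
	void *buf;

	assert(epfile != NULL);
	buf = atomic_exchange(&epfile->read_buffer, NULL);
	if (!buf)
		return 0;
	free(buf);
	return 1;
}
```

A second call on the same epfile_model finds NULL and does nothing, so only the caller that wins the exchange frees the buffer.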

thanks