Re: [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer

From: Konrad Rzeszutek Wilk
Date: Fri Nov 15 2019 - 12:59:34 EST


On Fri, Nov 15, 2019 at 01:11:41PM +0200, Roger Quadros wrote:
> +Konrad

You can run Linux with CONFIG_DEBU_DMA and use debug_dma_dump_mappings() to dump
and figure things out. See http://xenbits.xen.org/gitweb/?p=xentesttools/bootstrap.git;a=blob;f=root_image/drivers/dump/dump_dma.c;h=2ba251a2f8c36c24c68762b3e4c9f76ea178238f;hb=HEAD

>
> Jayshri,
>
> On 15/11/2019 12:14, Jayshri Dajiram Pawar wrote:
> > > > > There is a problem when function driver allocate memory for buffer
> > > > > used by DMA from outside dma_mask space.
> > > > > It appears during testing f_tcm driver with cdns3 controller.
> > > > > In the result cdns3 driver was not able to map virtual buffer to DMA.
> > > > > This fix should be improved depending on dma_mask associated with
> > > device.
> > > > > Adding GFP_DMA32 flag while allocationg command data buffer only for
> > > > > 32 bit controllers.
> > > >
> > > > Hi Jayshri,
> > > >
> > > > This issue should be fixed by setting DMA_MASK correctly for
> > > > controller, you can't limit user's memory region. At usb_ep_queue, the
> > > > UDC driver will call DMA MAP API, for Cadence, it is
> > > usb_gadget_map_request_by_dev.
> > > > For the system without SMMU (IO-MMU), it will use swiotlb to make sure
> > > > the data buffer used for DMA transfer is within DMA mask for
> > > > controller, There is a reserved low memory region for debounce buffer
> > > > in swiotlb use case.
> > > >
> > >
> > > /**
> > > * struct usb_request - describes one i/o request
> > > * @buf: Buffer used for data. Always provide this; some controllers
> > > * only use PIO, or don't use DMA for some endpoints.
> > > * @dma: DMA address corresponding to 'buf'. If you don't set this
> > > * field, and the usb controller needs one, it is responsible
> > > * for mapping and unmapping the buffer.
> > > <snip>
> > > */
> > >
> > > So if dma is not set in the usb_request then controller driver is responsible to
> > > do a dma_map of the buffer pointed by 'buf' before it attemps to do DMA.
> > > This should take care of DMA mask and swiotlb.
> > >
> > > This patch is not correct.
> > >
> > Hi Roger,
> >
> > We have scatter-gather disabled.
> > We are getting below error while allocation of cmd data buffer with length 524288 or greater, while writing large size files to device.
> > This error occurred on x86 platform.
> > Because of this reason we have added DMA flag while allocation of buffer.
> >
> > [ 1602.977532] swiotlb_tbl_map_single: 26 callbacks suppressed
> > [ 1602.977536] cdns-usb3 cdns-usb3.1: swiotlb buffer is full (sz: 524288 bytes), total 32768 (slots), used 0 (slots)
>
> Why is swiotlb buffer getting full? How much is it on your system?
> Are you sure that dma_unmap is happening on requests that complete? else we'll just keep hogging the swiotlb buffer.
>
> cheers,
> -roger
>
> > [ 1602.977542] cdns-usb3 cdns-usb3.1: overflow 0x00000007eee00000+524288 of DMA mask ffffffff bus mask 0
> > [ 1602.977555] WARNING: CPU: 6 PID: 285 at kernel/dma/direct.c:43 report_addr+0x37/0x60
> > [ 1602.977556] Modules linked in: target_core_user uio target_core_pscsi target_core_file target_core_iblock usb_f_tcm(OE) target_core_mod cdns3(OE) cdns3_pci_wrap(OE) roles(E) libcomposite(OE) udc_core(OE) xt_multiport iptable_filter bpfilter snd_hda_codec_hdmi nls_iso8859_1 i915 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel irqbypass snd_hda_codec snd_hda_core snd_hwdep snd_pcm drm_kms_helper snd_seq_midi snd_seq_midi_event crct10dif_pclmul snd_rawmidi crc32_pclmul drm snd_seq ghash_clmulni_intel snd_seq_device aesni_intel snd_timer mei_me i2c_algo_bit aes_x86_64 crypto_simd cryptd fb_sys_fops glue_helper snd mei input_leds syscopyarea intel_cstate sysfillrect intel_rapl_perf sysimgblt hp_wmi soundcore sparse_keymap serio_raw wmi_bmof tpm_infineon mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid e1000e psmouse ahci lpc_ich libahci i2c_i801 wmi
> > [ 1602.977605] video
> > [ 1602.977613] CPU: 6 PID: 285 Comm: kworker/6:2 Tainted: G OE 5.2.0-rc3cdns3-jayshri-stream-common+ #7
> > [ 1602.977615] Hardware name: Hewlett-Packard HP EliteDesk 800 G1 TWR/18E4, BIOS L01 v02.21 12/17/2013
> > [ 1602.977623] Workqueue: tcm_usb_gadget usbg_cmd_work [usb_f_tcm]
> > [ 1602.977628] RIP: 0010:report_addr+0x37/0x60
> > [ 1602.977631] Code: 48 8b 87 28 02 00 00 48 89 75 f8 48 85 c0 74 2a 4c 8b 00 b8 fe ff ff ff 49 39 c0 76 11 80 3d af 61 72 01 00 0f 84 df 06 00 00 <0f> 0b c9 c3 48 83 bf 38 02 00 00 00 74 f2 eb e3 80 3d 93 61 72 01
> > [ 1602.977634] RSP: 0018:ffffa0a6834dfc00 EFLAGS: 00010046
> > [ 1602.977636] RAX: 0000000000000000 RBX: ffff8ec574aeb010 RCX: 0000000000000000
> > [ 1602.977638] RDX: 0000000000000007 RSI: 0000000000000086 RDI: 0000000000000000
> > [ 1602.977640] RBP: ffffa0a6834dfc08 R08: 0000000000000569 R09: ffffffffa2189fb8
> > [ 1602.977642] R10: 0000000000000069 R11: ffffa0a6834df940 R12: 0000000000080000
> > [ 1602.977644] R13: ffff8ec5ad536218 R14: ffff8ec5ad693800 R15: ffff8ec5ad693800
> > [ 1602.977647] FS: 0000000000000000(0000) GS:ffff8ec5be980000(0000) knlGS:0000000000000000
> > [ 1602.977649] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 1602.977651] CR2: 00007f05a56b7000 CR3: 000000036fc0a006 CR4: 00000000001606e0
> > [ 1602.977653] Call Trace:
> > [ 1602.977660] dma_direct_map_page+0xdf/0xf0
> > [ 1602.977669] usb_gadget_map_request_by_dev+0x17a/0x190 [udc_core]
> > [ 1602.977679] __cdns3_gadget_ep_queue.isra.30+0x149/0x2e0 [cdns3]
> > [ 1602.977686] ? kmalloc_order+0x18/0x40
> > [ 1602.977693] cdns3_gadget_ep_queue+0x53/0x100 [cdns3]
> > [ 1602.977698] usb_ep_queue+0x36/0xa0 [udc_core]
> > [ 1602.977704] usbg_send_write_request+0x1ae/0x250 [usb_f_tcm]
> > [ 1602.977731] transport_generic_new_cmd+0x1bc/0x320 [target_core_mod]
> > [ 1602.977749] transport_handle_cdb_direct+0x42/0x60 [target_core_mod]
> > [ 1602.977766] target_submit_cmd_map_sgls+0x176/0x230 [target_core_mod]
> > [ 1602.977771] ? __switch_to_asm+0x40/0x70
> > [ 1602.977788] target_submit_cmd+0x26/0x30 [target_core_mod]
> > [ 1602.977794] usbg_cmd_work+0x60/0xd0 [usb_f_tcm]
> > [ 1602.977799] process_one_work+0x20f/0x410
> > [ 1602.977802] worker_thread+0x34/0x400
> > [ 1602.977807] kthread+0x120/0x140
> > [ 1602.977811] ? process_one_work+0x410/0x410
> > [ 1602.977815] ? __kthread_parkme+0x70/0x70
> > [ 1602.977818] ret_from_fork+0x35/0x40
> > [ 1602.977822] ---[ end trace 70f27f846049ae32 ]---
> > [ 1602.977826] cdns-usb3 cdns-usb3.1: failed to map buffer
> > [ 1602.977853] uasp_send_write_request(695)
> >
> > Regards,
> > Jayshri
> >
> > > cheers,
> > > -roger
> > >
> > > > Peter
> > > >
> > > > >
> > > > > Signed-off-by: Pawel Laszczak <pawell@xxxxxxxxxxx>
> > > > > Signed-off-by: Jayshri Pawar <jpawar@xxxxxxxxxxx>
> > > > > ---
> > > > > drivers/usb/gadget/function/f_tcm.c | 20 ++++++++++++++------
> > > > > include/linux/usb/gadget.h | 2 ++
> > > > > 2 files changed, 16 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/drivers/usb/gadget/function/f_tcm.c
> > > > > b/drivers/usb/gadget/function/f_tcm.c
> > > > > index 36504931b2d1..a78d5fad3d84 100644
> > > > > --- a/drivers/usb/gadget/function/f_tcm.c
> > > > > +++ b/drivers/usb/gadget/function/f_tcm.c
> > > > > @@ -213,7 +213,8 @@ static int bot_send_read_response(struct
> > > usbg_cmd *cmd)
> > > > > }
> > > > >
> > > > > if (!gadget->sg_supported) {
> > > > > - cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC);
> > > > > + cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC |
> > > > > + gadget->dma_flag);
> > > > > if (!cmd->data_buf)
> > > > > return -ENOMEM;
> > > > >
> > > > > @@ -257,7 +258,8 @@ static int bot_send_write_request(struct
> > > usbg_cmd *cmd)
> > > > > }
> > > > >
> > > > > if (!gadget->sg_supported) {
> > > > > - cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_KERNEL);
> > > > > + cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL
> > > |
> > > > > + gadget->dma_flag);
> > > > > if (!cmd->data_buf)
> > > > > return -ENOMEM;
> > > > >
> > > > > @@ -305,6 +307,7 @@ static void bot_cmd_complete(struct usb_ep *ep,
> > > struct usb_request *req)
> > > > > static int bot_prepare_reqs(struct f_uas *fu)
> > > > > {
> > > > > int ret;
> > > > > + struct usb_gadget *gadget = fuas_to_gadget(fu);
> > > > >
> > > > > fu->bot_req_in = usb_ep_alloc_request(fu->ep_in, GFP_KERNEL);
> > > > > if (!fu->bot_req_in)
> > > > > @@ -327,7 +330,8 @@ static int bot_prepare_reqs(struct f_uas *fu)
> > > > > fu->bot_status.req->complete = bot_status_complete;
> > > > > fu->bot_status.csw.Signature = cpu_to_le32(US_BULK_CS_SIGN);
> > > > >
> > > > > - fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL);
> > > > > + fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL |
> > > > > + gadget->dma_flag);
> > > > > if (!fu->cmd.buf)
> > > > > goto err_buf;
> > > > >
> > > > > @@ -515,7 +519,8 @@ static int uasp_prepare_r_request(struct
> > > usbg_cmd *cmd)
> > > > > struct uas_stream *stream = cmd->stream;
> > > > >
> > > > > if (!gadget->sg_supported) {
> > > > > - cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC);
> > > > > + cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC |
> > > > > + gadget->dma_flag);
> > > > > if (!cmd->data_buf)
> > > > > return -ENOMEM;
> > > > >
> > > > > @@ -763,11 +768,13 @@ static int uasp_alloc_stream_res(struct f_uas
> > > > > *fu, struct uas_stream *stream)
> > > > >
> > > > > static int uasp_alloc_cmd(struct f_uas *fu)
> > > > > {
> > > > > + struct usb_gadget *gadget = fuas_to_gadget(fu);
> > > > > fu->cmd.req = usb_ep_alloc_request(fu->ep_cmd, GFP_KERNEL);
> > > > > if (!fu->cmd.req)
> > > > > goto err;
> > > > >
> > > > > - fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL);
> > > > > + fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL |
> > > > > + gadget->dma_flag);
> > > > > if (!fu->cmd.buf)
> > > > > goto err_buf;
> > > > >
> > > > > @@ -980,7 +987,8 @@ static int usbg_prepare_w_request(struct
> > > usbg_cmd *cmd, struct usb_request *req)
> > > > > struct usb_gadget *gadget = fuas_to_gadget(fu);
> > > > >
> > > > > if (!gadget->sg_supported) {
> > > > > - cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC);
> > > > > + cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC |
> > > > > + gadget->dma_flag);
> > > > > if (!cmd->data_buf)
> > > > > return -ENOMEM;
> > > > >
> > > > > diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
> > > > > index 124462d65eac..d6c9cd222600 100644
> > > > > --- a/include/linux/usb/gadget.h
> > > > > +++ b/include/linux/usb/gadget.h
> > > > > @@ -373,6 +373,7 @@ struct usb_gadget_ops {
> > > > > * @connected: True if gadget is connected.
> > > > > * @lpm_capable: If the gadget max_speed is FULL or HIGH, this flag
> > > > > * indicates that it supports LPM as per the LPM ECN & errata.
> > > > > + * @dma_flag: dma zone to be used for buffer allocation.
> > > > > *
> > > > > * Gadgets have a mostly-portable "gadget driver" implementing device
> > > > > * functions, handling all usb configurations and interfaces.
> > > > > Gadget @@ -427,6 +428,7 @@ struct usb_gadget {
> > > > > unsigned deactivated:1;
> > > > > unsigned connected:1;
> > > > > unsigned lpm_capable:1;
> > > > > + unsigned int dma_flag;
> > > > > };
> > > > > #define work_to_gadget(w) (container_of((w), struct usb_gadget,
> > > work))
> > > > >
> > > > > --
> > > > > 2.20.1
> > > > >
> > > >
> > >
> > > --
> > > Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> > > Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
>
> --
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki