Re: [PATCH 4.19 50/54] drm/msm: stop abusing dma_map/unmap for cache

From: Naresh Kamboju
Date: Wed Apr 22 2020 - 16:24:22 EST


On Mon, 13 Apr 2020 at 13:56, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Apr 13, 2020 at 05:03:26AM +0000, nobuhiro1.iwamatsu@xxxxxxxxxxxxx wrote:
> > Hi,
> >
> > > -----Original Message-----
> > > From: stable-owner@xxxxxxxxxxxxxxx [mailto:stable-owner@xxxxxxxxxxxxxxx] On Behalf Of Greg Kroah-Hartman
> > > Sent: Saturday, April 11, 2020 9:10 PM
> > > To: linux-kernel@xxxxxxxxxxxxxxx
> > > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>; stable@xxxxxxxxxxxxxxx; Stephen Boyd <sboyd@xxxxxxxxxx>; Stephen
> > > Boyd <swboyd@xxxxxxxxxxxx>; Jordan Crouse <jcrouse@xxxxxxxxxxxxxx>; Rob Clark <robdclark@xxxxxxxxxxxx>; Sean Paul
> > > <seanpaul@xxxxxxxxxxxx>; Lee Jones <lee.jones@xxxxxxxxxx>
> > > Subject: [PATCH 4.19 50/54] drm/msm: stop abusing dma_map/unmap for cache
> > >
> > > From: Rob Clark <robdclark@xxxxxxxxxxxx>
> > >
> > > commit 0036bc73ccbe7e600a3468bf8e8879b122252274 upstream.
> > >
> > > Recently splats like this started showing up:
> > >
> > > WARNING: CPU: 4 PID: 251 at drivers/iommu/dma-iommu.c:451 __iommu_dma_unmap+0xb8/0xc0
> > > Modules linked in: ath10k_snoc ath10k_core fuse msm ath mac80211 uvcvideo cfg80211 videobuf2_vmalloc videobuf2_memops
> > > vide
> > > CPU: 4 PID: 251 Comm: kworker/u16:4 Tainted: G W 5.2.0-rc5-next-20190619+ #2317
> > > Hardware name: LENOVO 81JL/LNVNB161216, BIOS 9UCN23WW(V1.06) 10/25/2018
> > > Workqueue: msm msm_gem_free_work [msm]
> > > pstate: 80c00005 (Nzcv daif +PAN +UAO)
> > > pc : __iommu_dma_unmap+0xb8/0xc0
> > > lr : __iommu_dma_unmap+0x54/0xc0
> > > sp : ffff0000119abce0
> > > x29: ffff0000119abce0 x28: 0000000000000000
> > > x27: ffff8001f9946648 x26: ffff8001ec271068
> > > x25: 0000000000000000 x24: ffff8001ea3580a8
> > > x23: ffff8001f95ba010 x22: ffff80018e83ba88
> > > x21: ffff8001e548f000 x20: fffffffffffff000
> > > x19: 0000000000001000 x18: 00000000c00001fe
> > > x17: 0000000000000000 x16: 0000000000000000
> > > x15: ffff000015b70068 x14: 0000000000000005
> > > x13: 0003142cc1be1768 x12: 0000000000000001
> > > x11: ffff8001f6de9100 x10: 0000000000000009
> > > x9 : ffff000015b78000 x8 : 0000000000000000
> > > x7 : 0000000000000001 x6 : fffffffffffff000
> > > x5 : 0000000000000fff x4 : ffff00001065dbc8
> > > x3 : 000000000000000d x2 : 0000000000001000
> > > x1 : fffffffffffff000 x0 : 0000000000000000
> > > Call trace:
> > > __iommu_dma_unmap+0xb8/0xc0
> > > iommu_dma_unmap_sg+0x98/0xb8
> > > put_pages+0x5c/0xf0 [msm]
> > > msm_gem_free_work+0x10c/0x150 [msm]
> > > process_one_work+0x1e0/0x330
> > > worker_thread+0x40/0x438
> > > kthread+0x12c/0x130
> > > ret_from_fork+0x10/0x18
> > > ---[ end trace afc0dc5ab81a06bf ]---
> > >
> > > Not quite sure what triggered that, but we really shouldn't be abusing
> > > dma_{map,unmap}_sg() for cache maint.
> > >
> > > Cc: Stephen Boyd <sboyd@xxxxxxxxxx>
> > > Tested-by: Stephen Boyd <swboyd@xxxxxxxxxxxx>
> > > Reviewed-by: Jordan Crouse <jcrouse@xxxxxxxxxxxxxx>
> > > Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
> > > Signed-off-by: Sean Paul <seanpaul@xxxxxxxxxxxx>
> > > Link: https://patchwork.freedesktop.org/patch/msgid/20190630124735.27786-1-robdclark@xxxxxxxxx
> > > Signed-off-by: Lee Jones <lee.jones@xxxxxxxxxx>
> > > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> >
> > This commit also requires the following commits:
> >
> > commit 3de433c5b38af49a5fc7602721e2ab5d39f1e69c
> > Author: Rob Clark <robdclark@xxxxxxxxxxxx>
> > Date: Tue Jul 30 14:46:28 2019 -0700
> >
> > drm/msm: Use the correct dma_sync calls in msm_gem
> >
> > [subject was: drm/msm: shake fist angrily at dma-mapping]
> >
> > So, using dma_sync_* for our cache needs works out w/ dma iommu ops, but
> > it falls appart with dma direct ops. The problem is that, depending on
> > display generation, we can have either set of dma ops (mdp4 and dpu have
> > iommu wired to mdss node, which maps to toplevel drm device, but mdp5
> > has iommu wired up to the mdp sub-node within mdss).
> >
> > Fixes this splat on mdp5 devices:
> >
> > Unable to handle kernel paging request at virtual address ffffffff80000000
> > Mem abort info:
> > ESR = 0x96000144
> > Exception class = DABT (current EL), IL = 32 bits
> > SET = 0, FnV = 0
> > EA = 0, S1PTW = 0
> > Data abort info:
> > ISV = 0, ISS = 0x00000144
> > CM = 1, WnR = 1
> > swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000810e4000
> > [ffffffff80000000] pgd=0000000000000000
> > Internal error: Oops: 96000144 [#1] SMP
> > Modules linked in: btqcomsmd btqca bluetooth cfg80211 ecdh_generic ecc rfkill libarc4 panel_simple msm wcnss_ctrl qrtr_smd drm_kms_helper venus_enc venus_dec videobuf2_dma_sg videobuf2_memops drm venus_core ipv6 qrtr qcom_wcnss_pil v4l2_mem2mem qcom_sysmon videobuf2_v4l2 qmi_helpers videobuf2_common crct10dif_ce mdt_loader qcom_common videodev qcom_glink_smem remoteproc bmc150_accel_i2c bmc150_magn_i2c bmc150_accel_core bmc150_magn snd_soc_lpass_apq8016 snd_soc_msm8916_analog mms114 mc nf_defrag_ipv6 snd_soc_lpass_cpu snd_soc_apq8016_sbc industrialio_triggered_buffer kfifo_buf snd_soc_lpass_platform snd_soc_msm8916_digital drm_panel_orientation_quirks
> > CPU: 2 PID: 33 Comm: kworker/2:1 Not tainted 5.3.0-rc2 #1
> > Hardware name: Samsung Galaxy A5U (EUR) (DT)
> > Workqueue: events deferred_probe_work_func
> > pstate: 80000005 (Nzcv daif -PAN -UAO)
> > pc : __clean_dcache_area_poc+0x20/0x38
> > lr : arch_sync_dma_for_device+0x28/0x30


We have noticed this problem on stable-rc 4.19.118-rc1 running on arm64
qualcomm dragonboard-410c device while booting.

[ 5.474942] msm 1a00000.mdss: A306: using IOMMU
[ 5.483399] Unable to handle kernel paging request at virtual
address ffffffff80000000
[ 5.487564] Mem abort info:
[ 5.498182] ESR = 0x96000144
[ 5.507101] SET = 0, FnV = 0
[ 5.507114] EA = 0, S1PTW = 0
[ 5.509154] Data abort info:
[ 5.512253] ISV = 0, ISS = 0x00000144
[ 5.515376] CM = 1, WnR = 1
[ 5.518877] swapper pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____)
[ 5.522072] [ffffffff80000000] pgd=0000000000000000
[ 5.528819] Internal error: Oops: 96000144 [#1] PREEMPT SMP
[ 5.533491] Modules linked in: msm(+) crc32_ce adv7511 cec
mdt_loader drm_kms_helper drm drm_panel_orientation_quirks fuse
[ 5.539057] Process systemd-udevd (pid: 2807, stack limit =
0x(____ptrval____))
[ 5.550162] CPU: 0 PID: 2807 Comm: systemd-udevd Not tainted
4.19.118-rc1-00065-gb5f03cd61ab6 #1
[ 5.557366] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[ 5.566392] pstate: 80000005 (Nzcv daif -PAN -UAO)
[ 5.573079] pc : __clean_dcache_area_poc+0x20/0x38
[ 5.577676] lr : __swiotlb_sync_sg_for_device+0x74/0xa0
[ 5.582447] sp : ffff00000dcab490
[ 5.587567] x29: ffff00000dcab490 x28: 0000000000000001
[ 5.591043] x27: ffff000000d97e40 x26: ffff80003bb9e000
[ 5.596423] x25: ffff80003c14c410 x24: ffff000009066178
[ 5.601717] x23: ffff80003c14c810 x22: 0000000000000000
[ 5.607013] x21: 0000000000000001 x20: 0000000000000001
[ 5.612308] x19: ffff80003634cf80 x18: 0000000000000400
[ 5.617603] x17: 0000000000000000 x16: 0000000000000000
[ 5.622899] x15: 0000000000000400 x14: 0000000000000400
[ 5.628194] x13: 0000000000000001 x12: 0000000000000000
[ 5.633489] x11: 0000800036c50000 x10: ffff80003639fba8
[ 5.638783] x9 : 0000000000001000 x8 : ffff7e0000e7c080
[ 5.644079] x7 : 0000000000000001 x6 : 0000000000000000
[ 5.649375] x5 : 0000000000000000 x4 : ffffffff80000000
[ 5.654669] x3 : 000000000000003f x2 : 0000000000000040
[ 5.659964] x1 : ffffffff80001000 x0 : ffffffff80000000
[ 5.665260] Call trace:
[ 5.670556] __clean_dcache_area_poc+0x20/0x38
[ 5.672850] get_pages+0x1cc/0x240 [msm]
[ 5.677355] msm_gem_get_iova+0x94/0x138 [msm]
[ 5.681428] _msm_gem_kernel_new+0x40/0xb0 [msm]
[ 5.685679] msm_gem_kernel_new+0x10/0x18 [msm]
[ 5.690452] msm_gpu_init+0x300/0x568 [msm]
[ 5.694698] adreno_gpu_init+0x14c/0x268 [msm]
[ 5.698861] a3xx_gpu_init+0x7c/0x108 [msm]
[ 5.703375] adreno_bind+0x144/0x238 [msm]
[ 5.707365] component_bind_all+0x110/0x270
[ 5.711627] msm_drm_bind+0x104/0x760 [msm]
[ 5.715609] try_to_bring_up_master+0x14c/0x1a8
[ 5.719775] component_master_add_with_match+0xc0/0x100
[ 5.724388] msm_pdev_probe+0x280/0x320 [msm]
[ 5.729499] platform_drv_probe+0x50/0xa0
[ 5.734010] really_probe+0x1f4/0x290
[ 5.738003] driver_probe_device+0x54/0xe8
[ 5.741649] __driver_attach+0xe0/0xe8
[ 5.745644] bus_for_each_dev+0x70/0xb8
[ 5.749374] driver_attach+0x20/0x28
[ 5.753108] bus_add_driver+0x1a0/0x210
[ 5.756927] driver_register+0x60/0x110
[ 5.760485] __platform_driver_register+0x44/0x50
[ 5.764407] msm_drm_register+0x54/0x68 [msm]
[ 5.769169] do_one_initcall+0x54/0x154
[ 5.773509] do_init_module+0x54/0x1c8
[ 5.777152] load_module+0x1bf4/0x2190
[ 5.780972] __se_sys_finit_module+0xb8/0xc8
[ 5.784706] __arm64_sys_finit_module+0x18/0x20
[ 5.789135] el0_svc_common+0x70/0x168
[ 5.793385] el0_svc_handler+0x2c/0x80
[ 5.797204] el0_svc+0x8/0xc
[ 5.800939] Code: 9ac32042 8b010001 d1000443 8a230000 (d50b7e20)
[ 5.803980] ---[ end trace 004276cd8aee46e8 ]---

Ref:
Full test logs.
https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.117-65-gb5f03cd61ab6/testrun/1387563/log
https://lkft.validation.linaro.org/scheduler/job/1387568#L3575

kernel configs link,
https://builds.tuxbuild.com/TcvobwCBir3uhOd2MA-ndw/kernel.config

--
Linaro LKFT
https://lkft.linaro.org