Re: dmaengine: pl330 rare NULL pointer dereference in pl330_tasklet

From: Krzysztof Kozlowski
Date: Mon Nov 02 2020 - 03:41:30 EST


On Mon, Nov 02, 2020 at 08:38:14AM +0100, Marek Szyprowski wrote:
> Hi Krzysztof,
>
> On 31.10.2020 20:01, Krzysztof Kozlowski wrote:
> > I hit a quite rare issue with the pl330 DMA driver, difficult to
> > reproduce (actually I failed to do so):
> >
> > It happened during early reboot:
> >
> > [ OK ] Stopped target Graphical Interface.
> > [ OK ] Stopped target Multi-User System.
> > [ OK ] Stopped target RPC Port Mapper.
> > Stopping OpenSSH Daemonti[ 75.447904] 8<--- cut here ---
> > [ 75.449506] Unable to handle kernel NULL pointer dereference at virtual address 0000000c
> > ...
> > [ 75.690850] [<c0902f70>] (pl330_tasklet) from [<c034d460>] (tasklet_action_common+0x88/0x1f4)
> > [ 75.699340] [<c034d460>] (tasklet_action_common) from [<c03013f8>] (__do_softirq+0x108/0x428)
> > [ 75.707850] [<c03013f8>] (__do_softirq) from [<c034dadc>] (run_ksoftirqd+0x2c/0x4c)
> > [ 75.715486] [<c034dadc>] (run_ksoftirqd) from [<c036fbfc>] (smpboot_thread_fn+0x13c/0x24c)
> > [ 75.723693] [<c036fbfc>] (smpboot_thread_fn) from [<c036c18c>] (kthread+0x13c/0x16c)
> > [ 75.731390] [<c036c18c>] (kthread) from [<c03001a8>] (ret_from_fork+0x14/0x2c)
> >
> > Full log:
> > https://krzk.eu/#/builders/20/builds/954/steps/22/logs/serial0
> >
> > 1. Arch ARM Linux
> > 2. multi_v7_defconfig
> > 3. Odroid HC1, ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC
> > 4. systemd, boot up with static IP set in kernel command line
> > 5. No swap
> > 6. Kernel, DTB and initramfs are downloaded with TFTP
> > 7. NFS root (NFS client) mounted from an NFSv4 server
> >
> > Since I was not able to reproduce it, obviously I did not run bisect. If
> > anyone has ideas, please share.
>
> Well, I've also observed it a few times. IMHO it is related to the
> broken UART (in DMA mode) shutdown procedure. Usually it can be easily
> observed as random parts of previously transmitted data being flushed
> to the UART console during system shutdown. This also depends on the
> board and the system used (especially the presence of systemd, which
> plays with the UART differently than the old sysv init). IMHO there is
> a kind of use-after-free issue there, so the above pl330 stacktrace
> can also be observed depending on the timing and system load. This
> issue has been there since the beginning of the DMA support. I have it
> on my todo list, but it had too low a priority to look into it. I only
> briefly checked the related code a few years ago and noticed that the
> UART shutdown is not really synchronized with DMA. However, at that
> time I didn't find any simple fix, so I gave up.

Thanks for the explanation.
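
For illustration only, here is a minimal user-space model of the kind of
ordering problem described above: deferred completion work (a tasklet in
the real driver) runs after shutdown has already released what it points
at. This is a sketch, not pl330 or UART driver code. All names in it
(model_chan, model_desc, model_run_pending and the two shutdown helpers)
are hypothetical. In the kernel, the "drain first" step would presumably
map to something like dmaengine_terminate_sync() being called before the
buffers are released; whether that is sufficient for this bug is an
assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real pl330 or UART structures. */
struct model_desc {
	int residue;
};

struct model_chan {
	struct model_desc *active;	/* descriptor the deferred handler touches */
	bool work_pending;		/* models a scheduled tasklet */
};

/*
 * Models the deferred completion handler. Returns false where the real
 * handler would have dereferenced a pointer that shutdown already cleared.
 */
static bool model_run_pending(struct model_chan *ch)
{
	if (!ch->work_pending)
		return true;

	ch->work_pending = false;
	if (ch->active == NULL)
		return false;	/* in the kernel this would be the oops */

	ch->active->residue = 0;
	return true;
}

/* Shutdown that releases the descriptor while work is still queued. */
static void model_shutdown_unsynced(struct model_chan *ch)
{
	ch->active = NULL;
}

/* Shutdown that drains queued work first, then releases the descriptor. */
static bool model_shutdown_synced(struct model_chan *ch)
{
	bool ok = model_run_pending(ch);

	ch->active = NULL;
	return ok;
}
```

With work still pending, calling model_shutdown_unsynced() and then
model_run_pending() hits the NULL case, while model_shutdown_synced()
lets the handler finish against a valid descriptor before releasing it.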

Best regards,
Krzysztof