Re: [PATCH] xtensa: Fix mmap() support

From: Boris Brezillon
Date: Thu Mar 30 2017 - 04:31:51 EST


+Christoph who introduced the CONFIG_ARCH_NO_COHERENT_DMA_MMAP option.

Hi Max,

On Wed, 29 Mar 2017 15:48:24 -0700
Max Filippov <jcmvbkbc@xxxxxxxxx> wrote:

> Hi Boris,
>
> On Tue, Mar 28, 2017 at 1:20 AM, Boris Brezillon
> <boris.brezillon@xxxxxxxxxxxxxxxxxx> wrote:
> > The xtensa architecture does not implement the dma_map_ops->mmap() hook,
> > thus relying on the default dma_common_mmap() implementation.
> > This implementation is only safe on DMA-coherent architecture (hence the
> > !defined(CONFIG_ARCH_NO_COHERENT_DMA_MMAP) condition), but xtensa does not
> > fall in this case.
>
> Why not? AFAIU "DMA-coherent" may mean "having DMA memory accessible
> w/o caching", is that right?

I think I misunderstood what CONFIG_ARCH_NO_COHERENT_DMA_MMAP means.
My understanding was that, if CONFIG_ARCH_NO_COHERENT_DMA_MMAP is not
set, the architecture is assumed to be cache-coherent, but re-reading
the option name I realize you're probably right.

> We have all low memory mapped in the uncached
> KSEG region, it may be mapped to the userspace w/o caching as well, that's
> what dma_common_mmap does.
>
> > This lead to bad pfn calculation when someone tries to mmap() one or
> > several pages that are not part of the identity mapping because the
> > dma_common_mmap() extract the pfn value from the virt address using
> > virt_to_page() which is only applicable on DMA-coherent platforms (on
> > other platforms, DMA coherent pages are mapped in a different region).
>
> Oh, yes. Our __pa implementation is buggy in a sense that it doesn't work
> correctly with addresses from the uncached KSEG.

I'm new to the xtensa architecture, and on ARM, virt_to_phys(), __pa()
and co are known to be unsafe when used on DMA-coherent memory.
Here it seems that you have twice the identity mapping in your virtual
address space: once with the uncached flag set, and another one with
cache activated, so it is indeed easier to patch __pa() to do the right
thing.

>
> > Implement xtensa_dma_mmap() (loosely based on __arm_dma_mmap()) in which
> > pfn is extracted from the DMA address using PFN_DOWN().
> >
> > While we're at it, select ARCH_NO_COHERENT_DMA_MMAP from the XTENSA
> > entry so that we never silently fallback to dma_common_mmap() if someone
> > decides to drop the xtensa_dma_mmap() implementation.
>
> I don't think this is right.

Yep, as said above, you're probably right.

>
> > Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxxxxxxx>
> > ---
> > Hello,
> >
> > This bug has been detected while developping a DRM driver on an FPGA
> > containing an Xtensa CPU. The DRM driver is using the generic CMA GEM
> > implementation which is allocating DMA coherent buffers in kernel space
> > and then allows userspace programs to mmap() these buffers.
> >
> > Whith the existing implementation, the userspace pointer was pointing
> > to a completely different physical region, thus leading to bad display
> > output and memory corruptions.
> >
> > I'm not sure the xtensa_dma_mmap() implementation is correct, but it
> > seems to solve my problem.
> >
> > Don't hesitate to propose a different implementation.
>
> Could you please instead check if the dma_common_mmap works for you
> with the attached patch?

I will. BTW, shouldn't it be

if (off >= XCHAL_KSEG_SIZE)

instead of

if (off > XCHAL_KSEG_SIZE)
?

>
> [...]
>
> > diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
> > index 70e362e6038e..8f3ef49ceba6 100644
> > --- a/arch/xtensa/kernel/pci-dma.c
> > +++ b/arch/xtensa/kernel/pci-dma.c
> > @@ -249,9 +249,32 @@ int xtensa_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
> > return 0;
> > }
> >
> > +static int xtensa_dma_mmap(struct device *dev, struct vm_area_struct *vma,
> > + void *cpu_addr, dma_addr_t dma_addr, size_t size,
> > + unsigned long attrs)
> > +{
> > + int ret = -ENXIO;
> > +#ifdef CONFIG_MMU
> > + unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
> > + unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > + unsigned long pfn = PFN_DOWN(dma_addr);
> > + unsigned long off = vma->vm_pgoff;
> > +
> > + if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
> > + return ret;
> > +
> > + if (off < nr_pages && nr_vma_pages <= (nr_pages - off))
> > + ret = remap_pfn_range(vma, vma->vm_start, pfn + off,
> > + vma->vm_end - vma->vm_start,
> > + vma->vm_page_prot);
>
> You use vma->vm_page_prot directly here, isn't it required to do
>
> vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
> to make the mapping uncached? Because otherwise I guess you
> get a mapping with cache which on Xtensa is not going to be
> coherent.

Indeed. Anyway, we'll probably keep relying on dma_common_mmap() which
is already doing the right thing.

Thanks for the review.

Regards,

Boris