Re: [PATCH 05/28] media/v4l2: remove V4L2-FLAG-MEMORY-NON-CONSISTENT

From: Tomasz Figa
Date: Wed Aug 19 2020 - 08:58:50 EST


On Wed, Aug 19, 2020 at 1:51 PM Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> Hi Tomasz,
>
> On 2020-08-19 12:16, Tomasz Figa wrote:
> > Hi Christoph,
> >
> > On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig <hch@xxxxxx> wrote:
> >>
> >> The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused,
> >
> > Could you explain what makes you think it's unused? It's a feature of
> > the UAPI generally supported by the videobuf2 framework and relied on
> > by Chromium OS to get any kind of reasonable performance when
> > accessing V4L2 buffers from userspace.
> >
> >> and causes
> >> weird gymnastics with the DMA_ATTR_NON_CONSISTENT flag, which is
> >> unimplemented except on PARISC and some MIPS configs, and about to be
> >> removed.
> >
> > It is implemented by the generic DMA mapping layer [1], which is used
> > by a number of architectures including ARM64 and supposed to be used
> > by new architectures going forward.
>
> AFAICS all that V4L2_FLAG_MEMORY_NON_CONSISTENT does is end up
> controlling whether DMA_ATTR_NON_CONSISTENT is added to vb2_queue::dma_attrs.
>
> Please can you point to where DMA_ATTR_NON_CONSISTENT does anything at
> all on arm64?
>

With the default config it doesn't, but with
CONFIG_DMA_NONCOHERENT_CACHE_SYNC enabled it makes dma_pgprot() keep
the pgprot value as is, without enforcing coherence attributes.
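
To illustrate the point, here is a rough userspace model of that
decision. It is only a sketch, not the kernel code: it considers a
non-coherent device only, and every MODEL_* name and value below is a
hypothetical stand-in invented for the example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for DMA_ATTR_NON_CONSISTENT; value is illustrative. */
#define MODEL_ATTR_NON_CONSISTENT (1UL << 0)

/* The two outcomes this model distinguishes for a non-coherent device. */
enum model_prot {
	MODEL_PROT_UNCHANGED,    /* caller's pgprot kept as is */
	MODEL_PROT_DMA_COHERENT, /* coherence attributes enforced */
};

/*
 * Simplified model of the pgprot decision for a non-coherent device.
 * @noncoherent_cache_sync: stands in for CONFIG_DMA_NONCOHERENT_CACHE_SYNC
 * @attrs: MODEL_ATTR_* bits requested by the caller
 */
static enum model_prot model_dma_pgprot(bool noncoherent_cache_sync,
					unsigned long attrs)
{
	/* The attribute only has an effect when the config option is set. */
	if (noncoherent_cache_sync && (attrs & MODEL_ATTR_NON_CONSISTENT))
		return MODEL_PROT_UNCHANGED;

	return MODEL_PROT_DMA_COHERENT;
}
```

With the option modelled as disabled the attribute changes nothing,
which matches the default-config behaviour described above.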


> Also, I posit that videobuf2 is not actually relying on
> DMA_ATTR_NON_CONSISTENT anyway, since it's clearly not using it properly:
>
> "By using this API, you are guaranteeing to the platform
> that you have all the correct and necessary sync points for this memory
> in the driver should it choose to return non-consistent memory."
>
> $ git grep dma_cache_sync drivers/media
> $

AFAIK dma_cache_sync() isn't the only way to perform the cache
synchronization. The earlier patch series that I reviewed relied on
dma_get_sgtable() and then dma_sync_sg_*() (which existed in the
vb2-dc since forever [1]). However, it looks like with the final code
the sgtable isn't acquired and the synchronization isn't happening, so
you have a point.

FWIW, I asked a while back what the plan was for non-coherent
allocations, and it seemed like DMA_ATTR_NON_CONSISTENT and
dma_sync_*() were supposed to be the right thing to go with. [2] The
same thread also explains why dma_alloc_pages() isn't suitable for the
users of dma_alloc_attrs() and DMA_ATTR_NON_CONSISTENT.

I think we could make a deal here. We could revert the parts using
DMA_ATTR_NON_CONSISTENT, keeping the UAPI intact but rendering it a
no-op, since it's just a hint after all. Then, you would propose a
proper replacement for dma_alloc_attrs(..., DMA_ATTR_NON_CONSISTENT)
that is functionally equivalent and works on ARM64, which we could
then use to enable the functionality expected by this UAPI.
Does that sound like something that could work as a way forward here?
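
As a rough sketch of the no-op idea, the following userspace model
contrasts the current flag handling with the proposed interim one. It
is an illustration under assumptions, not a patch: struct model_queue
and the model_* functions are hypothetical, the cache-hints check is
reduced to a single bool, and MODEL_DMA_ATTR_NON_CONSISTENT uses an
illustrative value.

```c
#include <stdbool.h>

/* Value taken from the UAPI header hunk quoted in the patch. */
#define V4L2_FLAG_MEMORY_NON_CONSISTENT (1 << 0)
/* Hypothetical stand-in for DMA_ATTR_NON_CONSISTENT; value is illustrative. */
#define MODEL_DMA_ATTR_NON_CONSISTENT (1UL << 0)

/* Hypothetical, heavily reduced stand-in for struct vb2_queue. */
struct model_queue {
	unsigned long dma_attrs;
	bool allow_cache_hints;
};

/* Current behaviour, modelled on set_queue_consistency() from the patch. */
static void model_reqbufs_current(struct model_queue *q, unsigned int flags)
{
	q->dma_attrs &= ~MODEL_DMA_ATTR_NON_CONSISTENT;

	if (!q->allow_cache_hints)
		return;
	if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
		q->dma_attrs |= MODEL_DMA_ATTR_NON_CONSISTENT;
}

/* Proposed interim behaviour: the flag is accepted but treated as a hint. */
static void model_reqbufs_noop(struct model_queue *q, unsigned int flags)
{
	(void)flags; /* still valid UAPI input, just not acted upon */
	q->dma_attrs &= ~MODEL_DMA_ATTR_NON_CONSISTENT;
}
```

In this model userspace can keep passing the flag without getting an
error, while the DMA attribute is never set until a working
replacement exists.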

By the way, as a videobuf2 reviewer, I'd appreciate being CC'd on any
series that changes the subsystem-facing DMA API, since videobuf2 is
one of its biggest users.

[1] https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/media/common/videobuf2/videobuf2-dma-contig.c#L98
[2] https://patchwork.kernel.org/comment/23312203/

Best regards,
Tomasz


>
> Robin.
>
> > [1] https://elixir.bootlin.com/linux/v5.9-rc1/source/kernel/dma/mapping.c#L341
> >
> > When removing features from generic kernel code, I'd suggest first
> > providing viable alternatives for its users, rather than killing the
> > users altogether.
> >
> > Given the above, I'm afraid I have to NAK this.
> >
> > Best regards,
> > Tomasz
> >
> >>
> >> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> >> ---
> >> .../userspace-api/media/v4l/buffer.rst | 17 ---------
> >> .../media/v4l/vidioc-reqbufs.rst | 1 -
> >> .../media/common/videobuf2/videobuf2-core.c | 36 +------------------
> >> .../common/videobuf2/videobuf2-dma-contig.c | 19 ----------
> >> .../media/common/videobuf2/videobuf2-dma-sg.c | 3 +-
> >> .../media/common/videobuf2/videobuf2-v4l2.c | 12 -------
> >> include/media/videobuf2-core.h | 3 +-
> >> include/uapi/linux/videodev2.h | 2 --
> >> 8 files changed, 3 insertions(+), 90 deletions(-)
> >>
> >> diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
> >> index 57e752aaf414a7..2044ed13cd9d7d 100644
> >> --- a/Documentation/userspace-api/media/v4l/buffer.rst
> >> +++ b/Documentation/userspace-api/media/v4l/buffer.rst
> >> @@ -701,23 +701,6 @@ Memory Consistency Flags
> >> :stub-columns: 0
> >> :widths: 3 1 4
> >>
> >> - * .. _`V4L2-FLAG-MEMORY-NON-CONSISTENT`:
> >> -
> >> - - ``V4L2_FLAG_MEMORY_NON_CONSISTENT``
> >> - - 0x00000001
> >> - - A buffer is allocated either in consistent (it will be automatically
> >> - coherent between the CPU and the bus) or non-consistent memory. The
> >> - latter can provide performance gains, for instance the CPU cache
> >> - sync/flush operations can be avoided if the buffer is accessed by the
> >> - corresponding device only and the CPU does not read/write to/from that
> >> - buffer. However, this requires extra care from the driver -- it must
> >> - guarantee memory consistency by issuing a cache flush/sync when
> >> - consistency is needed. If this flag is set V4L2 will attempt to
> >> - allocate the buffer in non-consistent memory. The flag takes effect
> >> - only if the buffer is used for :ref:`memory mapping <mmap>` I/O and the
> >> - queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
> >> - <V4L2-BUF-CAP-SUPPORTS-MMAP-CACHE-HINTS>` capability.
> >> -
> >> .. c:type:: v4l2_memory
> >>
> >> enum v4l2_memory
> >> diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> >> index 75d894d9c36c42..3180c111d368ee 100644
> >> --- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> >> +++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> >> @@ -169,7 +169,6 @@ aborting or finishing any DMA in progress, an implicit
> >> - This capability is set by the driver to indicate that the queue supports
> >> cache and memory management hints. However, it's only valid when the
> >> queue is used for :ref:`memory mapping <mmap>` streaming I/O. See
> >> - :ref:`V4L2_FLAG_MEMORY_NON_CONSISTENT <V4L2-FLAG-MEMORY-NON-CONSISTENT>`,
> >> :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>` and
> >> :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>`.
> >>
> >> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> >> index f544d3393e9d6b..66a41cef33c1b1 100644
> >> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> >> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> >> @@ -721,39 +721,14 @@ int vb2_verify_memory_type(struct vb2_queue *q,
> >> }
> >> EXPORT_SYMBOL(vb2_verify_memory_type);
> >>
> >> -static void set_queue_consistency(struct vb2_queue *q, bool consistent_mem)
> >> -{
> >> - q->dma_attrs &= ~DMA_ATTR_NON_CONSISTENT;
> >> -
> >> - if (!vb2_queue_allows_cache_hints(q))
> >> - return;
> >> - if (!consistent_mem)
> >> - q->dma_attrs |= DMA_ATTR_NON_CONSISTENT;
> >> -}
> >> -
> >> -static bool verify_consistency_attr(struct vb2_queue *q, bool consistent_mem)
> >> -{
> >> - bool queue_is_consistent = !(q->dma_attrs & DMA_ATTR_NON_CONSISTENT);
> >> -
> >> - if (consistent_mem != queue_is_consistent) {
> >> - dprintk(q, 1, "memory consistency model mismatch\n");
> >> - return false;
> >> - }
> >> - return true;
> >> -}
> >> -
> >> int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> >> unsigned int flags, unsigned int *count)
> >> {
> >> unsigned int num_buffers, allocated_buffers, num_planes = 0;
> >> unsigned plane_sizes[VB2_MAX_PLANES] = { };
> >> - bool consistent_mem = true;
> >> unsigned int i;
> >> int ret;
> >>
> >> - if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
> >> - consistent_mem = false;
> >> -
> >> if (q->streaming) {
> >> dprintk(q, 1, "streaming active\n");
> >> return -EBUSY;
> >> @@ -765,8 +740,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> >> }
> >>
> >> if (*count == 0 || q->num_buffers != 0 ||
> >> - (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
> >> - !verify_consistency_attr(q, consistent_mem)) {
> >> + (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) {
> >> /*
> >> * We already have buffers allocated, so first check if they
> >> * are not in use and can be freed.
> >> @@ -803,7 +777,6 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> >> num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
> >> memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
> >> q->memory = memory;
> >> - set_queue_consistency(q, consistent_mem);
> >>
> >> /*
> >> * Ask the driver how many buffers and planes per buffer it requires.
> >> @@ -894,12 +867,8 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> >> {
> >> unsigned int num_planes = 0, num_buffers, allocated_buffers;
> >> unsigned plane_sizes[VB2_MAX_PLANES] = { };
> >> - bool consistent_mem = true;
> >> int ret;
> >>
> >> - if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
> >> - consistent_mem = false;
> >> -
> >> if (q->num_buffers == VB2_MAX_FRAME) {
> >> dprintk(q, 1, "maximum number of buffers already allocated\n");
> >> return -ENOBUFS;
> >> @@ -912,15 +881,12 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> >> }
> >> memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
> >> q->memory = memory;
> >> - set_queue_consistency(q, consistent_mem);
> >> q->waiting_for_buffers = !q->is_output;
> >> } else {
> >> if (q->memory != memory) {
> >> dprintk(q, 1, "memory model mismatch\n");
> >> return -EINVAL;
> >> }
> >> - if (!verify_consistency_attr(q, consistent_mem))
> >> - return -EINVAL;
> >> }
> >>
> >> num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
> >> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> >> index ec3446cc45b8da..7b1b86ec942d7d 100644
> >> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> >> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> >> @@ -42,11 +42,6 @@ struct vb2_dc_buf {
> >> struct dma_buf_attachment *db_attach;
> >> };
> >>
> >> -static inline bool vb2_dc_buffer_consistent(unsigned long attr)
> >> -{
> >> - return !(attr & DMA_ATTR_NON_CONSISTENT);
> >> -}
> >> -
> >> /*********************************************/
> >> /* scatterlist table functions */
> >> /*********************************************/
> >> @@ -341,13 +336,6 @@ static int
> >> vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
> >> enum dma_data_direction direction)
> >> {
> >> - struct vb2_dc_buf *buf = dbuf->priv;
> >> - struct sg_table *sgt = buf->dma_sgt;
> >> -
> >> - if (vb2_dc_buffer_consistent(buf->attrs))
> >> - return 0;
> >> -
> >> - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
> >> return 0;
> >> }
> >>
> >> @@ -355,13 +343,6 @@ static int
> >> vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
> >> enum dma_data_direction direction)
> >> {
> >> - struct vb2_dc_buf *buf = dbuf->priv;
> >> - struct sg_table *sgt = buf->dma_sgt;
> >> -
> >> - if (vb2_dc_buffer_consistent(buf->attrs))
> >> - return 0;
> >> -
> >> - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
> >> return 0;
> >> }
> >>
> >> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> >> index 0a40e00f0d7e5c..a86fce5d8ea8bf 100644
> >> --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> >> +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> >> @@ -123,8 +123,7 @@ static void *vb2_dma_sg_alloc(struct device *dev, unsigned long dma_attrs,
> >> /*
> >> * NOTE: dma-sg allocates memory using the page allocator directly, so
> >> * there is no memory consistency guarantee, hence dma-sg ignores DMA
> >> - * attributes passed from the upper layer. That means that
> >> - * V4L2_FLAG_MEMORY_NON_CONSISTENT has no effect on dma-sg buffers.
> >> + * attributes passed from the upper layer.
> >> */
> >> buf->pages = kvmalloc_array(buf->num_pages, sizeof(struct page *),
> >> GFP_KERNEL | __GFP_ZERO);
> >> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> >> index 30caad27281e1a..de83ad48783821 100644
> >> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
> >> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> >> @@ -722,20 +722,11 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
> >> #endif
> >> }
> >>
> >> -static void clear_consistency_attr(struct vb2_queue *q,
> >> - int memory,
> >> - unsigned int *flags)
> >> -{
> >> - if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP)
> >> - *flags &= ~V4L2_FLAG_MEMORY_NON_CONSISTENT;
> >> -}
> >> -
> >> int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
> >> {
> >> int ret = vb2_verify_memory_type(q, req->memory, req->type);
> >>
> >> fill_buf_caps(q, &req->capabilities);
> >> - clear_consistency_attr(q, req->memory, &req->flags);
> >> return ret ? ret : vb2_core_reqbufs(q, req->memory,
> >> req->flags, &req->count);
> >> }
> >> @@ -769,7 +760,6 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
> >> unsigned i;
> >>
> >> fill_buf_caps(q, &create->capabilities);
> >> - clear_consistency_attr(q, create->memory, &create->flags);
> >> create->index = q->num_buffers;
> >> if (create->count == 0)
> >> return ret != -EBUSY ? ret : 0;
> >> @@ -998,7 +988,6 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
> >> int res = vb2_verify_memory_type(vdev->queue, p->memory, p->type);
> >>
> >> fill_buf_caps(vdev->queue, &p->capabilities);
> >> - clear_consistency_attr(vdev->queue, p->memory, &p->flags);
> >> if (res)
> >> return res;
> >> if (vb2_queue_is_busy(vdev, file))
> >> @@ -1021,7 +1010,6 @@ int vb2_ioctl_create_bufs(struct file *file, void *priv,
> >>
> >> p->index = vdev->queue->num_buffers;
> >> fill_buf_caps(vdev->queue, &p->capabilities);
> >> - clear_consistency_attr(vdev->queue, p->memory, &p->flags);
> >> /*
> >> * If count == 0, then just check if memory and type are valid.
> >> * Any -EBUSY result from vb2_verify_memory_type can be mapped to 0.
> >> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> >> index 52ef92049073e3..4c7f25b07e9375 100644
> >> --- a/include/media/videobuf2-core.h
> >> +++ b/include/media/videobuf2-core.h
> >> @@ -744,8 +744,7 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
> >> * vb2_core_reqbufs() - Initiate streaming.
> >> * @q: pointer to &struct vb2_queue with videobuf2 queue.
> >> * @memory: memory type, as defined by &enum vb2_memory.
> >> - * @flags: auxiliary queue/buffer management flags. Currently, the only
> >> - * used flag is %V4L2_FLAG_MEMORY_NON_CONSISTENT.
> >> + * @flags: auxiliary queue/buffer management flags.
> >> * @count: requested buffer count.
> >> *
> >> * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
> >> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> >> index c7b70ff53bc1dd..5c00f63d9c1b58 100644
> >> --- a/include/uapi/linux/videodev2.h
> >> +++ b/include/uapi/linux/videodev2.h
> >> @@ -191,8 +191,6 @@ enum v4l2_memory {
> >> V4L2_MEMORY_DMABUF = 4,
> >> };
> >>
> >> -#define V4L2_FLAG_MEMORY_NON_CONSISTENT (1 << 0)
> >> -
> >> /* see also http://vektor.theorem.ca/graphics/ycbcr/ */
> >> enum v4l2_colorspace {
> >> /*
> >> --
> >> 2.28.0
> >>