Re: [RFC][PATCH 13/15] videobuf2: do not sync buffers for DMABUF queues

From: Hans Verkuil
Date: Tue Jan 28 2020 - 03:47:51 EST


On 1/28/20 8:19 AM, Tomasz Figa wrote:
> On Thu, Jan 23, 2020 at 8:35 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>>
>> On 1/22/20 6:05 AM, Sergey Senozhatsky wrote:
>>> On (20/01/10 11:30), Hans Verkuil wrote:
>>> [..]
>>>>> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
>>>>> index 1762849288ae..2b9d3318e6fb 100644
>>>>> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
>>>>> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
>>>>> @@ -341,8 +341,22 @@ static void set_buffer_cache_hints(struct vb2_queue *q,
>>>>> struct vb2_buffer *vb,
>>>>> struct v4l2_buffer *b)
>>>>> {
>>>>> - vb->need_cache_sync_on_prepare = 1;
>>>>> + /*
>>>>> + * DMA exporter should take care of cache syncs, so we can avoid
>>>>> + * explicit ->prepare()/->finish() syncs.
>>>>> + */
>>>>> + if (q->memory == VB2_MEMORY_DMABUF) {
>>>>> + vb->need_cache_sync_on_finish = 0;
>>>>> + vb->need_cache_sync_on_prepare = 0;
>>>>> + return;
>>>>> + }
>>>>>
>>>>> + /*
>>>>> + * For other ->memory types we always need ->prepare() cache
>>>>> + * sync. ->finish() cache sync, however, can be avoided when queue
>>>>> + * direction is TO_DEVICE.
>>>>> + */
>>>>> + vb->need_cache_sync_on_prepare = 1;
>>>>
>>>> I'm trying to remember: what needs to be done in prepare()
>>>> for a capture buffer? I thought that for capture you only
>>>> needed to invalidate the cache in finish(), but nothing needs
>>>> to be done in the prepare().
>>>
>>> Hmm. Not sure. A precaution in case if user-space wrote to that buffer?
>>
>> But whatever was written in the buffer is going to be overwritten anyway.
>>
>> Unless I am mistaken the current situation is that the cache syncs are done
>> in both prepare and finish, regardless of the DMA direction.
>>
>> I would keep that behavior to avoid introducing any unexpected regressions.
>>
>
> It wouldn't be surprising if the buffer was first filled with default
> values (e.g. all zeroes) on the CPU. That would make the cache lines
> dirty and they could overwrite what the device writes. So we need to
> flush (aka clean) the write-back caches on prepare for CAPTURE
> buffers.

Very true, I'd forgotten about that. This should be documented in the userspace
documentation. And possible in this function as well.

I think these issues are complex enough that there is no such things as too
much documentation :-)

>
>> Then, if q->allow_cache_hint is set, then default to a cache sync (cache clean)
>> in the prepare for OUTPUT buffers and a cache sync (cache invalidate) in the
>> finish for CAPTURE buffers.
>
> I'd still default to the existing behavior even if allow_cache_hint is
> set, because of what I wrote above. Then if the userspace doesn't ever
> write to the buffers, it can request no flush/clean by setting the
> V4L2_BUF_FLAG_NO_CACHE_CLEAN flag when queuing the buffer.
>
>>
>> This also means that any drivers that want to access a buffer in between the
>> prepare...finish calls will need to do a begin/end_cpu_access. But that's a
>> separate matter.
>
> AFAIR with current design of the series, the drivers can opt-in for
> userspace cache sync hints, so by default even if the userspace
> requests sync to be skipped, it wouldn't have any effect unless the
> driver allows so. Then I'd imagine migrating all the drivers to
> request clean/invalidate explicitly. Note that the DMA-buf
> begin/end_cpu_access isn't enough here. We'd need something like
> vb2_begin/end_cpu_access() which also takes care of syncing
> inconsistent MMAP and USERPTR buffers.

I did just that in my old git branch for this:

https://git.linuxtv.org/hverkuil/media_tree.git/log/?h=vb2-cpu-access

Regards,

Hans

>
>>
>> Regards,
>>
>> Hans
>>
>>>
>>> + if (q->dma_dir == DMA_FROM_DEVICE)
>>> + q->need_cache_sync_on_prepare = 0;
>>>
>>> ?
>>>
>>> -ss
>>>
>>