Re: [PATCH 2/2] media: docs-rst: Document memory-to-memory video encoder interface

From: Tomasz Figa
Date: Mon Oct 22 2018 - 00:58:09 EST


On Tue, Oct 16, 2018 at 10:50 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>
> On 10/16/18 09:36, Tomasz Figa wrote:
> > On Tue, Aug 7, 2018 at 3:54 PM Tomasz Figa <tfiga@xxxxxxxxxxxx> wrote:
> >>>> + * The driver must expose following selection targets on ``OUTPUT``:
> >>>> +
> >>>> + ``V4L2_SEL_TGT_CROP_BOUNDS``
> >>>> + maximum crop bounds within the source buffer supported by the
> >>>> + encoder
> >>>> +
> >>>> + ``V4L2_SEL_TGT_CROP_DEFAULT``
> >>>> + suggested cropping rectangle that covers the whole source picture
> >>>> +
> >>>> + ``V4L2_SEL_TGT_CROP``
> >>>> + rectangle within the source buffer to be encoded into the
> >>>> + ``CAPTURE`` stream; defaults to ``V4L2_SEL_TGT_CROP_DEFAULT``
> >>>> +
> >>>> + ``V4L2_SEL_TGT_COMPOSE_BOUNDS``
> >>>> + maximum rectangle within the coded resolution, which the cropped
> >>>> + source frame can be output into; always equal to (0, 0)x(width of
> >>>> + ``V4L2_SEL_TGT_CROP``, height of ``V4L2_SEL_TGT_CROP``), if the
> >>>> + hardware does not support compose/scaling
>
> Re-reading this I would rewrite this a bit:
>
> if the hardware does not support composition or scaling, then this is always
> equal to (0, 0)x(width of ``V4L2_SEL_TGT_CROP``, height of ``V4L2_SEL_TGT_CROP``).
>

Ack.

> >>>> +
> >>>> + ``V4L2_SEL_TGT_COMPOSE_DEFAULT``
> >>>> + equal to ``V4L2_SEL_TGT_CROP``
> >>>> +
> >>>> + ``V4L2_SEL_TGT_COMPOSE``
> >>>> + rectangle within the coded frame, which the cropped source frame
> >>>> + is to be output into; defaults to
> >>>> + ``V4L2_SEL_TGT_COMPOSE_DEFAULT``; read-only on hardware without
> >>>> + additional compose/scaling capabilities; resulting stream will
> >>>> + have this rectangle encoded as the visible rectangle in its
> >>>> + metadata
> >>>> +
> >>>> + ``V4L2_SEL_TGT_COMPOSE_PADDED``
> >>>> + always equal to coded resolution of the stream, as selected by the
> >>>> + encoder based on source resolution and crop/compose rectangles
> >>>
> >>> Are there codec drivers that support composition? I can't remember seeing any.
> >>>
> >>
> >> Hmm, I was convinced that MFC could scale and we just lacked support
> >> in the driver, but I checked the documentation and it doesn't seem to
> >> be able to do so. I guess we could drop the COMPOSE rectangles for
> >> now, until we find any hardware that can do scaling or composing on
> >> the fly.
> >>
> >
> > On the other hand, having them defined already wouldn't complicate
> > existing drivers too much either, because they would just handle all
> > of them in the same switch case, i.e.
> >
> > case V4L2_SEL_TGT_COMPOSE_BOUNDS:
> > case V4L2_SEL_TGT_COMPOSE_DEFAULT:
> > case V4L2_SEL_TGT_COMPOSE:
> > case V4L2_SEL_TGT_COMPOSE_PADDED:
> > return visible_rectangle;
> >
> > That would need one change, though. We would define
> > V4L2_SEL_TGT_COMPOSE_DEFAULT to be equal to (0, 0)x(width of
> > V4L2_SEL_TGT_CROP - 1, height of ``V4L2_SEL_TGT_CROP - 1), which
>
> " - 1"? Where does that come from?
>
> Usually rectangles are specified as widthxheight@left,top.
>

Yeah, the notation I used was quite unfortunate. How about just making
it fully verbose?

if the hardware does not support
composition or scaling, then this is always equal to the rectangle of
width and height matching ``V4L2_SEL_TGT_CROP`` and located at (0, 0)

> > makes more sense than current definition, since it would bypass any
> > compose/scaling by default.
>
> I have no problem with drivers optionally implementing these rectangles,
> even if they don't do scaling or composition. The question is, should it
> be required for decoders? If there is a good reason, then I'm OK with it.

There is no particular reason to do it for existing drivers. I'm
personally not a big fan of making things optional, since you never
know when something becomes required and then you run into problems
with user space compatibility. In this case the cost of having those
rectangles defined is really low and they will be useful to handle
encoders and decoders with scaling/compose ability in the future.

I have no strong opinion though.

Best regards,
Tomasz