Re: [PATCH v4 06/14] dma-buf/sync_file: Support (E)POLLPRI

From: Rob Clark
Date: Thu Feb 23 2023 - 13:52:06 EST


On Thu, Feb 23, 2023 at 1:38 AM Pekka Paalanen <ppaalanen@xxxxxxxxx> wrote:
>
> On Wed, 22 Feb 2023 07:37:26 -0800
> Rob Clark <robdclark@xxxxxxxxx> wrote:
>
> > On Wed, Feb 22, 2023 at 1:49 AM Pekka Paalanen <ppaalanen@xxxxxxxxx> wrote:
> > >
> > > On Tue, 21 Feb 2023 09:53:56 -0800
> > > Rob Clark <robdclark@xxxxxxxxx> wrote:
> > >
> > > > On Tue, Feb 21, 2023 at 8:48 AM Luben Tuikov <luben.tuikov@xxxxxxx> wrote:
> > > > >
> > > > > On 2023-02-20 11:14, Rob Clark wrote:
> > > > > > On Mon, Feb 20, 2023 at 12:53 AM Pekka Paalanen <ppaalanen@xxxxxxxxx> wrote:
> > > > > >>
> > > > > >> On Sat, 18 Feb 2023 13:15:49 -0800
> > > > > >> Rob Clark <robdclark@xxxxxxxxx> wrote:
> > > > > >>
> > > > > >>> From: Rob Clark <robdclark@xxxxxxxxxxxx>
> > > > > >>>
> > > > > >>> Allow userspace to use the EPOLLPRI/POLLPRI flag to indicate an urgent
> > > > > >>> wait (as opposed to a "housekeeping" wait to know when to cleanup after
> > > > > >>> some work has completed). Usermode components of GPU driver stacks
> > > > > >>> often poll() on fence fd's to know when it is safe to do things like
> > > > > >>> free or reuse a buffer, but they can also poll() on a fence fd when
> > > > > >>> waiting to read back results from the GPU. The EPOLLPRI/POLLPRI flag
> > > > > >>> lets the kernel differentiate these two cases.
> > > > > >>>
> > > > > >>> Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
> > > > > >>
> > > > > >> Hi,
> > > > > >>
> > > > > >> where would the UAPI documentation of this go?
> > > > > >> It seems to be missing.
> > > > > >
> > > > > > Good question, I am not sure. The poll() man page has a description,
> > > > > > but my usage doesn't fit that _exactly_ (but OTOH the description is a
> > > > > > bit vague).
> > > > > >
> > > > > >> If a Wayland compositor is polling application fences to know which
> > > > > >> client buffer to use in its rendering, should the compositor poll with
> > > > > >> PRI or not? If a compositor polls with PRI, then all fences from all
> > > > > >> applications would always be PRI. Would that be harmful somehow or
> > > > > >> would it be beneficial?
> > > > > >
> > > > > > I think a compositor would rather use the deadline ioctl and then poll
> > > > > > without PRI. Otherwise you are giving an urgency signal to the fence
> > > > > > signaller which might not necessarily be needed.
> > > > > >
> > > > > > The places where I expect PRI to be useful is more in mesa (things
> > > > > > like glFinish(), readpix, and other similar sorts of blocking APIs)
> > > > > Hi,
> > > > >
> > > > > Hmm, but then user-space could do the opposite, namely, submit work as usual--never
> > > > > using the SET_DEADLINE ioctl, and then at the end, poll using (E)POLLPRI. That seems
> > > > > like a possible usage pattern, unintended--maybe, but possible. Do we want to discourage
> > > > > this? Wouldn't SET_DEADLINE be enough? I mean, one can call SET_DEADLINE with the current
> > > > > time, and then wouldn't that be equivalent to (E)POLLPRI?
> > > >
> > > > Yeah, (E)POLLPRI isn't strictly needed if we have SET_DEADLINE. It is
> > > > slightly more convenient if you want an immediate deadline (single
> > > > syscall instead of two), but not strictly needed. OTOH it piggy-backs
> > > > on existing UABI.
> > >
> > > In that case, I would be conservative, and not add the POLLPRI
> > > semantics. An UAPI addition that is not strictly needed and somewhat
> > > unclear if it violates any design principles is best not done, until it
> > > is proven to be beneficial.
> > >
> > > Besides, a Wayland compositor does not necessary need to add the fd
> > > to its main event loop for poll. It could just SET_DEADLINE, and then
> > > when it renders simply check if the fence passed or not already. Not
> > > polling means the compositor does not need to wake up at the moment the
> > > fence signals to just record a flag.
> >
> > poll(POLLPRI) isn't intended for wayland.. but is a thing I want in
> > mesa for fence waits. I _could_ use SET_DEADLINE but it is two
> > syscalls and correspondingly more code ;-)
>
> But is it actually beneficial? "More code" seems quite irrelevant.
>
> Would there be a hundred or more of those per frame? Or would it be
> always limited to one or two? Or totally depend on what the application
> is doing? Is it a significant impact?

In general, any time the CPU is waiting on the GPU, you have already
lost. So I don't think the extra syscall is too much of a problem.
Just less convenient.

> > > On another matter, if the application uses SET_DEADLINE with one
> > > timestamp, and the compositor uses SET_DEADLINE on the same thing with
> > > another timestamp, what should happen?
> >
> > The expectation is that many deadline hints can be set on a fence.
> > The fence signaller should track the soonest deadline.
>
> You need to document that as UAPI, since it is observable to userspace.
> It would be bad if drivers or subsystems would differ in behaviour.
>

It is in the end a hint. It is about giving the driver more
information so that it can make better choices. But the driver is
even free to ignore it. So maybe "expectation" is too strong of a
word. Rather, any other behavior doesn't really make sense. But it
could end up being dictated by how the hw and/or fw works.

BR,
-R

>
> Thanks,
> pq