Re: [PATCH 2/6] gpu: host1x: Fix syncpoint wait return value

From: Daniel Vetter
Date: Wed Jun 12 2013 - 07:00:38 EST

On Wed, Jun 12, 2013 at 12:28 PM, Terje Bergström <tbergstrom@xxxxxxxxxx> wrote:
> On 11.06.2013 15:09, Daniel Vetter wrote:
>> Maybe it wasn't clear, but -EAGAIN does _not_ resubmit work. -EAGAIN
>> is used to restart the ioctl if we had to kick a thread (to make sure
>> it doesn't hold any locks), e.g. for a blocking wait on outstanding
>> rendering. The codepaths taken work exactly as if the thread is
>> interrupted by a signal.
> You did make it clear that there's no resubmission, but other parts
> confused me.
> So this is used so that a legacy driver which does not do fine-grained
> locking can interrupt all waits for completion for a wedged submit. This
> way a driver-wide lock gets unlocked, cleanup code acquires locks, does
> the magic to unwedge the GPU, and unlocks. Then user space can re-submit the
> waits as it got -EAGAIN.

I think this is not just for drivers without fine-grained locking, at
least I expect that we'll keep the same mechanism when switching over
to per-object locking - we simply have too many places where a thread
could arbitrarily block while holding locks that the gpu reset handler
also needs to grab. You could of course restructure the code massively
and drop all locks while waiting, but that means adding tons of
special-purpose code which is only really exercised when a gpu hang
occurs. Our approach with the ioctl restart otoh reuses codepaths
which are all heavily used by X (due to X constantly getting interrupted
by timers and input events).
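
To make the restart flow concrete, here is a rough userspace-only sketch.
Everything in it is made up for illustration: struct fake_gpu,
fake_wait_ioctl() and restarting_wait() are hypothetical names, not host1x
or i915 code, and the "lock" is just a flag. The only part meant to mirror
real practice is the retry loop on EINTR/EAGAIN, which is similar in spirit
to what libdrm's drmIoctl() wrapper does.

```c
#include <errno.h>

/* Hypothetical stand-in for driver state; not a real kernel structure. */
struct fake_gpu {
	int resets_pending;	/* how many times a reset will kick the waiter */
	int lock_held;		/* 1 while the pretend driver lock is held */
	int calls;		/* how many times the wait ioctl was entered */
};

/*
 * Mimics the kernel side of a blocking wait: take the lock, start
 * waiting, and if a gpu reset kicks us, unwind (dropping the lock) and
 * fail with EAGAIN, exactly as if a signal had interrupted the wait.
 */
static int fake_wait_ioctl(struct fake_gpu *gpu)
{
	gpu->calls++;
	gpu->lock_held = 1;

	if (gpu->resets_pending > 0) {
		gpu->resets_pending--;
		gpu->lock_held = 0;	/* reset handler can now grab the lock */
		errno = EAGAIN;
		return -1;
	}

	gpu->lock_held = 0;		/* normal completion path */
	return 0;
}

/*
 * Userspace side: restart the ioctl for both EINTR and EAGAIN. No work
 * is resubmitted; the same wait is simply issued again.
 */
static int restarting_wait(struct fake_gpu *gpu)
{
	int ret;

	do {
		ret = fake_wait_ioctl(gpu);
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));

	return ret;
}
```

With resets_pending set to 2, restarting_wait() enters the fake ioctl three
times and never returns with the lock flag still set. The point is that the
EAGAIN path and the signal path share the same unwind code, so nothing
special has to be written for the hang case.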
