Re: more intel drm issues (was Re: [git pull] drm intel only fixes)

From: Chris Wilson
Date: Thu Jan 20 2011 - 12:38:17 EST


On Thu, 20 Jan 2011 08:07:02 -0800, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Jan 20, 2011 at 2:25 AM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Right, the autoreported HEAD may have been already reset to 0 and so hit
> > the wraparound bug which caused it to exit early without actually
> > quiescing the ringbuffer.
>
> Yeah, that would explain the issue.
>
> > Another possibility is that I added a 3s timeout waiting for a request if
> > IRQs were suspended:
>
> No, if IRQ's are actually suspended here, then that codepath is
> totally buggy and would blow up (msleep() doesn't work, and jiffies
> wouldn't advance on UP). So that's not it.
>
> > Both of those I think are symptoms of another problem, that perhaps during
> > suspend we are shutting down parts of the chip before idling?
>
> That could be, but looking at the code, one thing strikes me: the
> _normal_ case (of just waiting for "enough space" in the ring buffer)
> doesn't need to use the exact case, but the "wait for ring buffer to
> be totally empty" does.
>
> Which means that the use of the "fast-but-inaccurate" 'head' sounds
> wrong for the "wait for idle" case.
>
> So can you explain the difference between
>
> intel_read_status_page(ring, 4);
>
> vs
>
> I915_READ_HEAD(ring);

For I915_READ_HEAD, we need to wake up the GT power well, perform an
uncached read from the register, and then power down again. This takes on
the order of 100 microseconds (less if the GT is already powered up, etc).

A read from the status page, by contrast, comes from cached memory. The
caveat is that the value there is only updated by the gfx engine when its
HEAD crosses a 64k boundary, so quite rarely.
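
To illustrate why the stale value is still good enough for the "enough
space" case, here is a minimal sketch, not the driver code. The names
ring_space(), struct head_source and ring_has_space() are hypothetical, and
the 8-byte reserve and power-of-two ring size are assumptions. A head that
lags behind the real one can only under-report free space, so the expensive
register read is needed only when the cheap check fails:

```c
#include <stdint.h>

/* Free bytes in a ring whose size is a power of two. A few bytes are held
 * back so that head == tail always means "empty" (assumed reserve of 8). */
static uint32_t ring_space(uint32_t head, uint32_t tail, uint32_t size)
{
    return (head - tail - 8) & (size - 1);
}

/* Hypothetical stand-in for the two ways of learning HEAD. */
struct head_source {
    uint32_t cached;    /* cheap but possibly stale status-page value */
    uint32_t exact;     /* what the register read would return */
    unsigned reg_reads; /* counts simulated expensive register reads */
};

/* Try the cheap value first; fall back to the exact one only if the
 * cheap value does not already prove there is enough room. */
static int ring_has_space(struct head_source *h, uint32_t tail,
                          uint32_t size, uint32_t need)
{
    if (ring_space(h->cached, tail, size) >= need)
        return 1;
    h->reg_reads++; /* models the ~100us uncached read */
    return ring_space(h->exact, tail, size) >= need;
}
```

With a 128k ring, tail at 0x1F000 and a cached head of 0, a request for 1k
is satisfied from the cached value alone, while a request for 8k falls
through to the exact head.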

> because from looking at the code, I get the notion that
> "intel_read_status_page()" may not be exact. But what happens if that
> inexact value matches our cached ring->actual_head, so we never even
> try to read the exact case? Does it _stay_ inexact for arbitrarily
> long times? If so, we might wait for the ring to empty forever (well,
> until the timeout - the behavior I see), even though the ring really
> _is_ empty. No?

Ah. Your analysis is spot on and this will cause a hang whilst polling if
we enter the loop with the last known head the same as the reported value.
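
A small model of the failure, as a sketch only: struct model_ring and the
helpers below are hypothetical, and I am assuming the reported value is HEAD
rounded down to the 64k boundary it just crossed. Once the engine has
consumed everything, the exact head equals tail, but the reported head may
never catch up, so an idle test based on it can stay false indefinitely:

```c
#include <stdbool.h>
#include <stdint.h>

#define AUTOREPORT_STRIDE (64u * 1024u) /* HEAD is reported every 64k */

/* Hypothetical model of a ring; not the i915 structures. */
struct model_ring {
    uint32_t size;          /* ring size in bytes, power of two */
    uint32_t hw_head;       /* exact HEAD, as a register read returns it */
    uint32_t reported_head; /* cached copy in the status page */
    uint32_t tail;          /* software write pointer */
};

/* Advance HEAD by n bytes. The status page is refreshed only when HEAD
 * crosses a 64k boundary (assumed to report the boundary itself). */
static void model_consume(struct model_ring *r, uint32_t n)
{
    uint32_t end = r->hw_head + n; /* position before wrapping */

    if (end / AUTOREPORT_STRIDE != r->hw_head / AUTOREPORT_STRIDE)
        r->reported_head = (end & ~(AUTOREPORT_STRIDE - 1)) & (r->size - 1);
    r->hw_head = end & (r->size - 1);
}

static uint32_t bytes_pending(uint32_t head, uint32_t tail, uint32_t size)
{
    return (tail - head) & (size - 1);
}

/* Idle test using the cheap, possibly stale value. */
static bool idle_by_status_page(const struct model_ring *r)
{
    return bytes_pending(r->reported_head, r->tail, r->size) == 0;
}

/* Idle test using the exact value. */
static bool idle_by_register(const struct model_ring *r)
{
    return bytes_pending(r->hw_head, r->tail, r->size) == 0;
}
```

In this model, after the engine consumes 4k of a 128k ring with tail at 4k,
idle_by_register() is true while idle_by_status_page() is still false, and
it stays false until the next 64k crossing happens to land exactly on tail.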

> Also, isn't that "head < ring->actual_head" buggy? What about the
> overflow case? Not that we care, because afaik, 'actual_head' is not
> actually used anywhere, so it should be called 'pointless_head'?

This is the one case that I think is handled correctly, ignoring all the
other bugs.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre