Re: [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed...GPU hung

From: Justin P. Mattock
Date: Thu Oct 25 2012 - 01:22:46 EST




On Tue, Oct 23, 2012 at 10:06:52AM -0700, Justin P. Mattock wrote:
> This is happening both with MAINLINE and NEXT.
>
> basically system is running fine, then under load system becomes
> really sluggish and unresponsive. I was able to get dmesg of the
> error..:
>
> [ 7745.007008] ath9k 0000:05:00.0 wlan0: disabling VHT as WMM/QoS is
> not supported by the AP
> [ 7745.007736] wlan0: associate with 68:7f:74:b8:05:82 (try 1/3)
> [ 7745.011456] wlan0: RX AssocResp from 68:7f:74:b8:05:82
> (capab=0x411 status=0 aid=5)
> [ 7745.011529] wlan0: associated
> [ 8120.812482] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 8120.812642] [drm] capturing error event; look for more
> information in /debug/dri/0/i915_error_state
> [ 8122.328682] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 8122.328845] [drm:i915_reset] *ERROR* GPU hanging too fast,
> declaring wedged!
> [ 8122.328850] [drm:i915_reset] *ERROR* Failed to reset chip.
>
> full log is here: http://fpaste.org/7xH8/
>
> as for good kernels from what I remember 3.6.0-rc1. I can try a
> bisect on this once I get the time. or if anybody has a patch I can
> test.

Can you please rehang your machine, and then grab the i915_error_state
from debugfs? That contains the gpu hang dump we need to diagnose things.
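
A minimal sketch of grabbing that file, assuming Python is available on the
affected machine and the script runs as root. The helper name
save_error_state and the candidate list are hypothetical; the second path is
the one printed in the dmesg excerpt above, while the first assumes debugfs is
mounted at /sys/kernel/debug, which may not match this system.

```python
import shutil
from pathlib import Path

# Assumed debugfs locations. Only the second one appears in the dmesg output;
# the first is a common mount point and is an assumption here.
CANDIDATES = [
    "/sys/kernel/debug/dri/0/i915_error_state",
    "/debug/dri/0/i915_error_state",
]


def save_error_state(dest, candidates=CANDIDATES):
    """Copy the first readable candidate file to dest.

    Returns the source path that was copied, or None if no candidate
    could be opened (missing file, debugfs not mounted, not root).
    """
    for src in candidates:
        try:
            fin = open(src, "rb")
        except OSError:
            continue  # try the next candidate location
        with fin, open(dest, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        return Path(src)
    return None
```

Calling save_error_state("i915_error_state.txt") right after the hang would
leave a regular file that can be compressed and uploaded somewhere.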

And the bisect would obviously be awesome.

Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

It took a while to trigger, but the hang finally fired off.

Here is a link to the file (intel_error_decode):
http://www.filefactory.com/file/22bypyjhs4mx

The file was too large to send to the list. Let me know if you need more info on this.
Also, if anybody has ideas on how to trigger this reliably, that would be appreciated, so the bisect can be more precise. Right now I don't even think a bisect is worth it: since I can't trigger the hang on demand, the bisect could go astray and point to the wrong commit (which has happened in the past). But then again, you never know.
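
For judging each bisect step, a small check for the hang signature in the
kernel log might help. This is only a sketch: the function names are
hypothetical, the pattern is taken from the dmesg lines quoted earlier in this
thread, and it does nothing to make the hang easier to trigger. A kernel that
shows no hang may simply not have been loaded long enough, so a clean result
is weak evidence for "good".

```python
import re

# Matches the timestamp and the hangcheck message as printed in the dmesg
# excerpt from this thread; other kernel versions may word it differently.
HANG_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[drm:i915_hangcheck_hung\]\s+\*ERROR\*\s+Hangcheck timer"
)


def hang_timestamps(log_text):
    """Return the kernel timestamps (seconds since boot) of each hangcheck error."""
    return [float(m.group(1)) for m in HANG_RE.finditer(log_text)]


def saw_gpu_hang(log_text):
    """True if the log contains at least one hangcheck error."""
    return bool(hang_timestamps(log_text))
```

Feeding it the saved output of dmesg after a load test gives a quick yes/no
for that kernel, plus how long after boot the hang showed up.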

Justin P. Mattock