Re: [Intel-gfx] [git pull] drm fixes

From: Xi Ruoyao
Date: Tue Mar 24 2015 - 23:49:56 EST




On 03/25/2015 at 12:54 AM, Josh Boyer wrote:
On Tue, Mar 24, 2015 at 12:49 PM, Daniel Vetter <daniel@xxxxxxxx> wrote:
On Tue, Mar 24, 2015 at 05:48:31PM +0100, Daniel Vetter wrote:
On Tue, Mar 24, 2015 at 12:10:28PM -0400, Josh Boyer wrote:
On Tue, Mar 24, 2015 at 10:46 AM, Josh Boyer <jwboyer@xxxxxxxxxxxxxxxxx> wrote:
On Tue, Mar 24, 2015 at 10:34 AM, Daniel Vetter <daniel@xxxxxxxx> wrote:
On Tue, Mar 24, 2015 at 10:22:30AM -0400, Josh Boyer wrote:
On Tue, Mar 24, 2015 at 9:57 AM, Josh Boyer <jwboyer@xxxxxxxxxxxxxxxxx> wrote:
On Tue, Mar 24, 2015 at 9:40 AM, Daniel Vetter <daniel@xxxxxxxx> wrote:
On Tue, Mar 24, 2015 at 09:15:32AM -0400, Josh Boyer wrote:
On Tue, Mar 24, 2015 at 3:32 AM, Daniel Vetter <daniel@xxxxxxxx> wrote:
On Mon, Mar 23, 2015 at 02:34:27PM -0400, Josh Boyer wrote:
On Mon, Mar 23, 2015 at 11:33 AM, Josh Boyer <jwboyer@xxxxxxxxxxxxxxxxx> wrote:

<snip>

Xi Ruoyao (1):
drm/i915: Ensure plane->state->fb stays in sync with plane->fb
Turns out to be that commit.

git bisect start 'drivers/gpu/drm/i915/'
# good: [b314acaccd7e0d55314d96be4a33b5f50d0b3344] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git bisect good b314acaccd7e0d55314d96be4a33b5f50d0b3344
# bad: [bc465aa9d045feb0e13b4a8f32cc33c1943f62d6] Linux 4.0-rc5
git bisect bad bc465aa9d045feb0e13b4a8f32cc33c1943f62d6
# bad: [319c1d420a0b62d9dbb88104afebaabc968cdbfa] drm/i915: Ensure plane->state->fb stays in sync with plane->fb
git bisect bad 319c1d420a0b62d9dbb88104afebaabc968cdbfa
# first bad commit: [319c1d420a0b62d9dbb88104afebaabc968cdbfa] drm/i915: Ensure plane->state->fb stays in sync with plane->fb

Doing a straight revert on top of 4.0-rc5 makes things work again,
albeit with the WARN_ON(obj->frontbuffer_bits) splat still being
there.
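
For anyone wanting to repeat the revert test described above, a minimal
sketch follows. It assumes a clone of Linus' tree that already has the
v4.0-rc5 tag; the helper name and the branch name "revert-test" are
hypothetical and not from this thread.

```shell
# Hypothetical helper: check out 4.0-rc5 on a scratch branch and revert
# the commit that the bisect above identified as the first bad one.
revert_on_rc5() {
    # "revert-test" is an arbitrary branch name chosen for illustration.
    git checkout -b revert-test v4.0-rc5 &&
    git revert --no-edit 319c1d420a0b62d9dbb88104afebaabc968cdbfa
}

# Usage, run inside the kernel tree, then rebuild and boot as usual:
# revert_on_rc5
```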
Can you please test the tip of drm-fixes:

commit 8218c3f4df3bb1c637c17552405039a6dd3c1ee1
Author: Daniel Vetter <daniel.vetter@xxxxxxxx>
Date: Fri Feb 27 12:58:13 2015 +0100

drm: Fixup racy refcounting in plane_force_disable

http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=8218c3f4df3bb1c637c17552405039a6dd3c1ee1

Because of a fumble, that patch didn't make it to drm-fixes a while ago
and landed in drm-next instead.
That seems to have helped with some totally different issues that a
macbook I have was seeing. However, it still doesn't fix the issue with
the Celeron-based NUC machine.

I built a kernel based on Linus' latest tree as of this morning,
without reverting 319c1d4 and adding the commit you pointed to. The
NUC still won't boot without HDMI connected. With HDMI connected I
still see the trace below. If I do the blacklist and then insmod
dance with HDMI unplugged it shows the same spew I reported yesterday
which starts with the same backtrace.

I'll try building a kernel with 319c1d4 reverted + your patch. I
suspect things will work fine with that combination because the two
issues are unrelated.
Can you please boot with drm.debug=0xff for the below case and grab
complete dmesg? There'll be a lot of crap in the logs, you might need to
blow up the logbuf size massively. But that log should contain everything
I need to figure out where that framebuffer we're blowing up on is going.
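
A rough sketch of the boot arguments this request implies. drm.debug and
log_buf_len are existing kernel parameters; the 16M size and the output
file name are assumptions picked for illustration, not values given in
this thread.

```shell
# Extra kernel command-line arguments for a verbose DRM log.
# Assumption: 16M is an arbitrary size; log_buf_len must be a power of two.
extra_args="drm.debug=0xff log_buf_len=16M"
echo "append to the kernel command line: $extra_args"

# After rebooting with those arguments, save the complete log
# (the file name is hypothetical):
# dmesg > drm-debug-dmesg.txt
```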
I provided both with HDMI attached and without (via insmod). If you
want them emailed directly let me know, but they were large.

Boot with drm.debug=0xff and HDMI connected:

https://jwboyer.fedorapeople.org/pub/drm-ff-dmesg.txt

Boot with drm.debug=0xff without HDMI connected and i915 loaded via
manual insmod after boot:

https://jwboyer.fedorapeople.org/pub/drm-ff-no-hdmi-insmod.txt
Here's one more from the macbook I mentioned. It's showing the same
kref.h splat:

https://jwboyer.fedorapeople.org/pub/drm-ff-macbook.txt
Ok, there's at least one fixup which we've failed to apply when porting
the fb refcounting fix from -next. Can you please cherry-pick

commit f55548b5af87ebfc586ca75748947f1c1b1a4a52
Author: Damien Lespiau <damien.lespiau@xxxxxxxxx>
Date: Thu Feb 5 18:30:20 2015 +0000

drm/i915: Don't try to reference the fb in get_initial_plane_config()

From linux-next?
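
One way the cherry-pick might be done, as a sketch only. The remote name
"linux-next" and the helper name are hypothetical, and the URL is assumed
to be the usual kernel.org location of the linux-next tree.

```shell
# Hypothetical helper: fetch linux-next and cherry-pick one commit from it.
pick_from_next() {
    commit="$1"
    # Assumed remote name and URL; adding the remote is skipped if it exists.
    git remote add linux-next \
        git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git \
        2>/dev/null || true
    git fetch linux-next &&
    # -x records the original commit id in the new commit message.
    git cherry-pick -x "$commit"
}

# Usage, run inside the kernel tree:
# pick_from_next f55548b5af87ebfc586ca75748947f1c1b1a4a52
```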
Yes, building now. Will let you know as soon as I test it on both machines.
OK, with that commit applied I no longer get the kref.h splat and the
NUC machine boots headless. I still see the backtrace below on both
the NUC and the macbook. I have a copy of it with drm.debug=0xff from
the NUC here:

https://jwboyer.fedorapeople.org/pub/nuc-drm-debug-ff-with-fixes.txt

Getting better at least :).
Ok thanks for testing. I'll look at that one tomorrow, wasted too much
time with trying to resurrect a few machines that should have matched the
common parts of what goes wrong here.

Jani, can you please cherry-pick the above commit to -fixes?
Actually add Jani this time around ...
-Daniel

One more question: Is the frontbuffer_bits splat now also gone? That was
the one I have no clue about, but since somewhere around 4.0-rc it started
popping up in a few places ... Thus far it was always the canary for some
other bug though.
As far as I can tell, it's gone. I don't see it on any of my i915
machines running the kernel with those two patches. I'll keep an eye
out for it as we work through 4.0-rcX.

josh
Fortunately my computer didn't get stuck, but unfortunately it is
my patch that caused so much trouble. I should have researched the
commits in linux-next more before applying it to mainline.

I found many WARNINGs in the kernel log after Josh reported this bug.
I will try Damien's solution.

--
Xi Ruoyao
School of Aerospace Science and Technology
Xidian University, Xi'an, China
