Re: Commit "locking/drm: Kill mutex trickery" causes hangs

From: Hugh Dickins
Date: Sun Oct 30 2016 - 20:10:03 EST


On Mon, 31 Oct 2016, Mike Krinkin wrote:
>
> I faced system hangs with recent linux-next versions, bisect points at the
> commit 3ab7c086d5ec72585ef0 ("locking/drm: Kill mutex trickery"), bisect log
> attached. The system just hangs after a few minutes when I compile a kernel
> with -j4 and watch some video simultaneously.
[...]
> also lspci -vvv output:
[...]
> Kernel driver in use: i915
> Kernel modules: i915

Yes, that's hit me too, on mmotm on i915. i915_gem_shrinker_lock()
is broken: but copy the pattern from msm_gem_shrinker_lock() and it's
okay - patch below. Well, okay-ish: I'm reluctant to sign off on that
as more than a quick fix for i915 linux-next users, since the unlock
variable and those _gem_shrinker_lock() wrappers should just be deleted
(if the mutex trickery is indeed to be killed).
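
For anyone who wants to see the difference between the two wrapper shapes
outside the kernel, here is a minimal userspace sketch. It is an analogy
only: pthread_mutex_trylock() stands in for mutex_trylock(), and the names
struct_mutex_demo, shrinker_lock_broken(), shrinker_lock_fixed() and demo()
are hypothetical, not taken from i915 or msm. The broken shape reports
success even when the trylock fails, and never sets *unlock on success; the
fixed shape returns false on failure and sets *unlock = true on success.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for dev->struct_mutex. */
static pthread_mutex_t struct_mutex_demo = PTHREAD_MUTEX_INITIALIZER;

/*
 * Broken shape: a failed trylock only clears *unlock and then still
 * reports success, so the caller carries on without holding the lock.
 * On a successful trylock *unlock is left at whatever the caller had.
 */
static bool shrinker_lock_broken(pthread_mutex_t *m, bool *unlock)
{
	if (pthread_mutex_trylock(m) != 0)
		*unlock = false;

	return true;
}

/*
 * Fixed shape: report failure when the lock could not be taken, and
 * tell the caller to unlock when it was.
 */
static bool shrinker_lock_fixed(pthread_mutex_t *m, bool *unlock)
{
	if (pthread_mutex_trylock(m) != 0)
		return false;

	*unlock = true;
	return true;
}

/* Exercise both shapes with the mutex already held (contended case). */
static void demo(void)
{
	bool unlock = false;

	pthread_mutex_lock(&struct_mutex_demo);
	/* Broken: claims success although the lock was not acquired. */
	assert(shrinker_lock_broken(&struct_mutex_demo, &unlock));
	/* Fixed: correctly reports that the lock was not acquired. */
	assert(!shrinker_lock_fixed(&struct_mutex_demo, &unlock));
	pthread_mutex_unlock(&struct_mutex_demo);

	/* Uncontended: fixed shape takes the lock and asks for an unlock. */
	if (shrinker_lock_fixed(&struct_mutex_demo, &unlock) && unlock)
		pthread_mutex_unlock(&struct_mutex_demo);
}
```

Calling demo() from a small main() and building with -pthread should run
through without tripping an assert. How the real callers react to the
bogus "true" is not modelled here.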

And I'm still left with a "sleeping function called from invalid context"
warning, which seems easier to live with: I've not looked to see whether
that's a consequence of the mutex trickery killage or something else.

[ 12.887922] BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:956
[ 12.887925] in_atomic(): 1, irqs_disabled(): 0, pid: 787, name: X
[ 12.887927] 1 lock held by X/787:
[ 12.887928] #0:
[ 12.887929] (
[ 12.887930] &dev->struct_mutex
[ 12.887931] ){+.+.+.}
[ 12.887932] , at:
[ 12.887937] [<ffffffff813e0ccb>] i915_mutex_lock_interruptible+0x23/0x26
[ 12.887939] Preemption disabled at:
[ 12.887943] [<ffffffff813d67c0>] i915_gem_execbuffer_relocate_entry+0x5fb/0x70f
[ 12.887947] CPU: 2 PID: 787 Comm: X Not tainted 4.9.0-rc2-mm1 #5
[ 12.887948] Hardware name: LENOVO 4174EH1/4174EH1, BIOS 8CET51WW (1.31 ) 11/29/2011
[ 12.887950] Call Trace:
[ 12.887955] dump_stack+0x67/0x90
[ 12.887958] ? i915_gem_execbuffer_relocate_entry+0x5fb/0x70f
[ 12.887961] ___might_sleep+0x223/0x23a
[ 12.887963] __might_sleep+0x6d/0x81
[ 12.887966] __pm_runtime_resume+0x35/0x7a
[ 12.887970] intel_runtime_pm_get+0x20/0x7f
[ 12.887973] aliasing_gtt_bind_vma+0x4d/0xb1
[ 12.887975] i915_vma_bind+0x67/0xbd
[ 12.887977] i915_gem_execbuffer_relocate_entry+0xc6/0x70f
[ 12.887981] ? _raw_spin_unlock_irq+0x27/0x45
[ 12.887984] i915_gem_execbuffer_relocate_vma+0x128/0x1dd
[ 12.887987] ? nommu_map_sg+0x9e/0xca
[ 12.887990] ? __i915_vma_do_pin+0x3da/0x421
[ 12.887994] ? i915_gem_execbuffer_reserve_vma.isra.34+0xbc/0x189
[ 12.887996] ? i915_gem_execbuffer_reserve.isra.35+0x32f/0x3da
[ 12.887999] i915_gem_do_execbuffer.isra.36+0x64c/0x10a9
[ 12.888002] i915_gem_execbuffer2+0x15d/0x203
[ 12.888005] drm_ioctl+0x25a/0x38b
[ 12.888007] ? i915_gem_execbuffer+0x2d3/0x2d3
[ 12.888011] vfs_ioctl+0x1c/0x33
[ 12.888014] do_vfs_ioctl+0x5c5/0x601
[ 12.888016] ? __fget+0x17e/0x18f
[ 12.888019] ? expand_files+0x23e/0x23e
[ 12.888021] SyS_ioctl+0x38/0x60
[ 12.888023] entry_SYSCALL_64_fastpath+0x18/0xad

--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -229,8 +229,9 @@ unsigned long i915_gem_shrink_all(struct
 static bool i915_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
 {
 	if (!mutex_trylock(&dev->struct_mutex))
-		*unlock = false;
+		return false;
 
+	*unlock = true;
 	return true;
 }