[BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

From: Peter Zijlstra
Date: Thu Sep 12 2013 - 11:07:00 EST


Hi Dave,

So I'm poking around the preemption code and stumbled upon:

drivers/gpu/drm/i915/i915_gem.c: set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c: set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c: set_need_resched();
drivers/gpu/drm/udl/udl_gem.c: set_need_resched();

All these sites basically do:

	while (!trylock())
		yield();

which is a horrible and broken locking pattern.

Firstly, it's deadlock prone: suppose the faulting process is a FIFOn+1
task that preempted the lock holder at FIFOn. The yield() never lets the
lower-priority holder run, so the lock is never released.
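
To make the contrast concrete, here is a rough userspace sketch, assuming
a pthread mutex as a stand-in for the driver locks. It is an illustration
only, not code from any of the files listed above; the names lock,
counter, add_trylock_yield() and add_blocking() are made up for the
example.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical shared state; stands in for whatever the driver lock protects. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* The criticised pattern: poll the lock, yielding the CPU between attempts. */
static void add_trylock_yield(long n)
{
	while (pthread_mutex_trylock(&lock) != 0)
		sched_yield();	/* only helps if the holder is allowed to run */
	counter += n;
	pthread_mutex_unlock(&lock);
}

/* Blocking variant: the caller sleeps until the holder releases the lock. */
static void add_blocking(long n)
{
	pthread_mutex_lock(&lock);
	counter += n;
	pthread_mutex_unlock(&lock);
}
```

Under SCHED_FIFO, sched_yield() only moves the caller behind runnable
tasks of the same priority, so a higher-priority caller of
add_trylock_yield() can spin forever while the lower-priority holder
never gets the CPU. add_blocking() at least puts the waiter to sleep;
whether the resulting priority inversion is resolved depends on the mutex
protocol (e.g. PTHREAD_PRIO_INHERIT), which this sketch does not
configure.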

Secondly, the implementation is worse than usual because it abuses
VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
doesn't retry, but you're using it as a get-out-of-fault path. And
you're using set_need_resched(), which is not something a driver should
_ever_ touch.

Now I'm going to take away set_need_resched() -- and while you can
'reimplement' it using set_thread_flag() you're not going to do that
because it will be broken due to changes to the preempt code.

So please fix this ASAP and don't allow anybody to trick you into
merging silly things like that again ;-)