Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

From: Chris Wilson
Date: Thu Sep 12 2013 - 13:02:14 EST


On Thu, Sep 12, 2013 at 06:22:10PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 12, 2013 at 05:58:49PM +0200, Daniel Vetter wrote:
> > On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >> The one in ttm is just bonghits to shut up lockdep: ttm can recurse
> > >> into its own pagefault handler and then deadlock, the trylock just
> > >> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
> > >> fun userspace did and now have testcases for them. The right solution
> > >> to fix this is to use copy_to|from_user_atomic in ttm everywhere it
> > >> holds locks and have slowpaths which drop locks, copy stuff into a
> > >> temp allocation and then continue. At least that's how we've fixed
> > >> all those inversions in i915-gem. I'm not volunteering to fix this ;-)
> > >
> > > Yikes.. so how common is it? If I simply rip the set_need_resched() out
> > > it will 'spin' on the fault a little longer until a 'natural' preemption
> > > point -- if such a thing is ever going to happen.
> >
> > It's a case of "our userspace doesn't do this", so as long as you're
> > not evil and frob the drm device nodes of ttm drivers directly the
> > deadlock will never happen. No idea how much contention actually
> > happens on e.g. shared buffer objects - in i915 we have just one lock
> > and so suffer quite a bit more from contention. So no idea how much
> > removing the yield would hurt.
>
> If 'sane' userspace is never supposed to do this, then only insane
> userspace is going to get hurt by this, and that's a GOOD (tm) thing,
> right? ;-)

Not quite: evil userspace could trigger a GPU hang that would cause
sane userspace to spin indefinitely, waiting for the error recovery to
kick in.
-Chris
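
For illustration, the trylock-and-yield pattern being discussed above looks
roughly like the sketch below. This is not the actual ttm_bo_vm_fault() code,
just the shape of it as described in the thread; "my_obj" and "my_fault" are
invented names, and the two-argument fault signature and set_need_resched()
are the interfaces as they existed at the time of this thread.

/*
 * Rough sketch of the fault-handler pattern discussed above, not the
 * real ttm code.  "my_obj" and "my_fault" are invented names.
 */
#include <linux/mm.h>
#include <linux/mutex.h>
#include <linux/sched.h>

struct my_obj {
        struct mutex lock;      /* per-object lock, also held around user copies */
        /* ... backing storage ... */
};

static int my_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
        struct my_obj *obj = vma->vm_private_data;

        if (!mutex_trylock(&obj->lock)) {
                /*
                 * The lock may be held by this very task (a copy_*_user
                 * done while holding obj->lock faulted back into here), so
                 * blocking would deadlock and lockdep would complain about
                 * the recursion.  Instead, hint to the scheduler that we
                 * should be preempted and let the fault be retried.
                 * Without the set_need_resched() the task simply refaults
                 * in a tight loop until some other preemption point hits.
                 */
                set_need_resched();
                return VM_FAULT_NOPAGE;
        }

        /*
         * Normal path: find or allocate the backing page, insert the pfn
         * into the vma (e.g. with vm_insert_pfn()) and report that the
         * pte has been set up.
         */
        mutex_unlock(&obj->lock);
        return VM_FAULT_NOPAGE;
}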

--
Chris Wilson, Intel Open Source Technology Centre
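
And a rough sketch of the fastpath/slowpath split Daniel describes for user
copies done under the driver lock. Again the names (my_copy_cmds, my_lock)
are invented and this is not the real ttm or i915 code; the copy_to_user
direction is symmetric, and the actual i915 slowpaths also revalidate the
state protected by the lock after retaking it.

#include <linux/mutex.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/uaccess.h>

/*
 * Sketch only.  Copy user data needed while "my_lock" is held without
 * ever taking a real pagefault under the lock; the caller is assumed to
 * have access_ok()-checked src already.
 */
static int my_copy_cmds(struct mutex *my_lock, void *dst,
                        const void __user *src, size_t len)
{
        unsigned long unread;
        void *tmp;
        int ret = 0;

        /*
         * Fastpath: pagefaults disabled, lock held.  If the user pages
         * are resident this succeeds without entering the fault handler,
         * so there is no recursion for lockdep to complain about.
         */
        pagefault_disable();
        unread = __copy_from_user_inatomic(dst, src, len);
        pagefault_enable();
        if (!unread)
                return 0;

        /*
         * Slowpath: drop the lock, take the fault with an ordinary
         * sleeping copy into a temporary allocation, then retake the
         * lock and finish.  Whatever the lock protects must be
         * revalidated by the caller, since it may have changed while
         * the lock was dropped.
         */
        mutex_unlock(my_lock);

        tmp = kmalloc(len, GFP_KERNEL);
        if (!tmp)
                ret = -ENOMEM;
        else if (copy_from_user(tmp, src, len))
                ret = -EFAULT;

        mutex_lock(my_lock);
        if (!ret)
                memcpy(dst, tmp, len);
        kfree(tmp);
        return ret;
}

The point of the split is that the lock is never held across anything that
can actually fault, so the fault handler no longer needs the trylock-and-yield
dance sketched above.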