Re: [Intel-gfx] [PATCH] drm/i915: Selectively enable self-reclaim

From: Linus Torvalds
Date: Sat Jul 17 2010 - 15:17:19 EST


On Sat, Jul 17, 2010 at 11:58 AM, M. Vefa Bicakci
<bicave@xxxxxxxxxxxxxxx> wrote:
>
> The kernel with d8e0902806c0bd2ccc4f6a267ff52565a3ec933b reverted
> was able to hibernate/thaw at least 40 times in one go, while
> the one with your fix applied was able to hibernate/thaw at most
> 17 times (in two separate trials) after which it crashed during
> the next thaw.

Ok. I do wonder if the bug is possibly something entirely different,
and the allocation patterns just happen to expose/hide it. Reverting
the original commit should be pretty darn close to applying my fix.
Any remaining issues would seem to be more about the actual bug in the
original code (racing on changing that mapping->gfp_mask witthout any
locking) than about anything else.

> Is there anything I can do find out the correct flags to use
> in addition to GFP_HIGHUSER ? Can I do something like a bisection
> for the flags one by one starting from the pre 2.6.32.8 state?
> If you could outline a procedure to do this, I would be glad to
> follow it.

You can try adding __GFP_RECLAIMABLE | __GFP_NOMEMALLOC to the set of
flags in i915_gem_object_get_pages(). That's what the old code had
(and then it played games with NORETRY|NOWARN). I've attached a patch
(UNTESTED! Maybe it won't compile).

Now, I don't see why those flags would matter, but NOMEMALLOC in
particular does make a difference for memory allocation patterns under
low memory conditions, so maybe it could make a difference.

And if it _does_ make a difference, it would be interesting to know
which of the two flags matter. So try both flags first, and see if
that gets you something reliable. And if it does, remove one of them
and try again - just to see _which_ flag it is that the i915 driver
would care about. That would hopefully give us a hint.

> Sorry to bug you again about this problem because of incomplete
> testing on my part.

Oh, never be sorry for testing even more, and testing something nobody
else bothered to test.

Linus

Attachment: diff
Description: Binary data