Re: [Intel-gfx] [PATCH] drm/i915: Selectively enable self-reclaim

From: Linus Torvalds
Date: Thu Jul 01 2010 - 20:08:11 EST


On Thu, Jul 1, 2010 at 3:34 PM, M. Vefa Bicakci <bicave@xxxxxxxxxxxxxxx> wrote:
>
> Based on my testing, I am happy to report that the change you suggest
> fixes the "memory corruption (segfaults) after thaw" issue for me.
> I can't thank you enough times for this.

Hey, goodie. And you're the one to be thanked - bisecting it down to
that commit that wasn't _meant_ to have any real semantic changes
(except for the bug-fix of racy mapping gfp-flags update) is what
really cracked the lid on the problem.

> Now, the obligatory question: Could we have this fix applied to 2.6.32,
> 2.6.33 and 2.6.34 ?

No problem, except we should first determine exactly what flags are
the appropriate ones. My original patch was obviously not even
compile-tested, and I actually meant for people to use GFP_HIGHUSER
rather than __GFP_HIGHMEM. That contains all the "regular" allocation
flags (but not the __GFP_MOVABLE, which is still just a suspicion of
being the core reason for the problem).

And the original DRM code had:

GFP_HIGHUSER |
__GFP_COLD |
__GFP_FS |
__GFP_RECLAIMABLE |
__GFP_NORETRY |
__GFP_NOWARN |
__GFP_NOMEMALLOC

which is not entirely sensible (__GFP_FS is already part of
GFP_HIGHUSER, for example), and two of the flags (NORETRY and NOWARN)
are the ones the driver wants to do conditionally.

But that still leaves the question about __GFP_COLD (probably sane),
__GFP_RECLAIMABLE (I wonder about that one) and __GFP_NOMEMALLOC
(usually used together with NORETRY, and I'm not at all sure it makes
sense as a base flag).

So I suspect the final patch should not look like the one you tested,
but instead likely have

GFP_HIGHUSER | __GFP_COLD

and possibly the __GFP_RECLAIMABLE flag too instead of just the bare
__GFP_HIGHMEM..

(Well, we already had that __GFP_COLD there from before, so it's
really about replacing __GFP_HIGHMEM with something like "GFP_HIGHUSER
| __GFP_RECLAIMABLE").

But its' great to hear that this does seem to be the underlying cause.
If you could test with that GFP_HIGHUSER | __GFP_RECLAIMABLE, that
would be a good thing. After all - maybe the problem was triggered by
some other flag than __GFP_MOVABLE, and as such, having some
additional testing with a bigger set of allocation flags would be a
really good thing.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/