Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects

From: Chris Wilson
Date: Tue Jul 12 2011 - 06:06:32 EST


On Tue, 12 Jul 2011 18:36:50 +0900, KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> Hi,
>
> sorry for the delay.
>
> > On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@xxxxxxxxxx> wrote:
> >> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> >>
> >>> Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
> >>> can't take a lock. Otherwise, vmscan is getting a lot of confusing
> >>> because vmscan can't distinguish "can't take a lock temporary" and
> >>> "we've shrank all of i915 objects".
> >>
> >> This doesn't look like the cleanest change possible. I think it would be
> >> better if the shrink function could uniformly return an error
> >> indication so that we wouldn't need the weird looking conditional return.
>
> shrink_icache_memory() is good sample code.
> It doesn't take a lock if sc->nr_to_scan==0. i915_gem_inactive_shrink() should do
> it too, ideally.
>
> My patch only take a first-aid.
>
> Plus, if I understand correctly, i915_gem_inactive_shrink() have more fundamental
> issue. actually, shrinker code shouldn't use mutex. Instead, use spinlock.
Why? The shrinker code is run in a non-atomic context that is explicitly
allowed to wait, or so I thought. Where's the caveat that prevents mutex?
Why doesn't the kernel complain?

> IOW, Don't call kmalloc(GFP_KERNEL) while taking dev->struct_mutex. Otherwise,
> vmscan in its call path completely fail to shrink i915 cache and it makes big
> memory reclaim confusing if i915 have a lot of shrinkable pages.

i915 can have several GiB of shrinkable pages. Of which 2 GiB may be tied
up in the GTT upon which we have to wait for the GPU to release. In the
future, we will be able to tie up all of physical memory.

There is only a single potential kmalloc in the shrinker path, for which
we could preallocate a request so that we always have one available here.

> > Unless I am mistaken, and there are more patches in flight, the return
> > code from i915_gem_inactive_shrink() is promoted to unsigned long and then
> > used in the calculation of how may objects to evict...
>
> shrinker->shrink has int type value. you can't change i915_gem_inactive_shrink()
> unless generic shrinker code.
> Do you really want to change it?

No, just pointing out that the patch causes warnings from the shrinker
code as it tries to process (unsigned long)-1 objects. shrink_slab() does
not use <0 as an error code!
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/