Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects

From: KOSAKI Motohiro
Date: Tue Jul 12 2011 - 20:25:41 EST


(2011/07/12 19:06), Chris Wilson wrote:
> On Tue, 12 Jul 2011 18:36:50 +0900, KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> sorry for the delay.
>>
>>> On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@xxxxxxxxxx> wrote:
>>>> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>>>>
>>>>> Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
>>>>> can't take a lock. Otherwise, vmscan gets badly confused because it
>>>>> can't distinguish "can't take the lock temporarily" from
>>>>> "we've shrunk all of the i915 objects".
>>>>
>>>> This doesn't look like the cleanest change possible. I think it would be
>>>> better if the shrink function could uniformly return an error
>>>> indication so that we wouldn't need the weird looking conditional return.
>>
>> shrink_icache_memory() is good sample code.
>> It doesn't take a lock if sc->nr_to_scan==0. i915_gem_inactive_shrink() should do
>> the same, ideally.
>>
>> My patch is only first aid.
>>
>> Plus, if I understand correctly, i915_gem_inactive_shrink() has a more fundamental
>> issue. Actually, shrinker code shouldn't use a mutex. Use a spinlock instead.
> Why? The shrinker code is run in a non-atomic context that is explicitly
> allowed to wait, or so I thought. Where's the caveat that prevents mutex?
> Why doesn't the kernel complain?

Contention is not the issue. The problem occurs when the mutex is already
held by the thread that calls shrink_slab(). In that case
i915_gem_inactive_shrink() has no way to shrink objects. How do you detect
such a case?
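
To illustrate the case I mean, here is a minimal user-space sketch. It is not
the i915 code: the toy_* names, the flag-based lock, and the batch size of 128
are all assumptions made up for illustration. It models a driver path that
holds the mutex, allocates, and so re-enters its own shrinker through reclaim.

```c
#include <stdbool.h>

/* Hypothetical flag-based lock; stands in for dev->struct_mutex. */
struct toy_mutex { bool locked; };

static bool toy_trylock(struct toy_mutex *m)
{
    if (m->locked)
        return false;
    m->locked = true;
    return true;
}

static void toy_unlock(struct toy_mutex *m)
{
    m->locked = false;
}

/* Hypothetical device with a count of shrinkable objects. */
struct toy_dev {
    struct toy_mutex struct_mutex;
    int inactive_objects;
};

/* Shrinker: returns -1 if the lock cannot be taken, else objects left. */
static int toy_inactive_shrink(struct toy_dev *dev, int nr_to_scan)
{
    int freed;

    if (!toy_trylock(&dev->struct_mutex))
        return -1; /* cannot tell "contended" from "held by my caller" */

    freed = nr_to_scan < dev->inactive_objects ? nr_to_scan
                                               : dev->inactive_objects;
    dev->inactive_objects -= freed;
    toy_unlock(&dev->struct_mutex);
    return dev->inactive_objects;
}

/* Models an allocation under memory pressure: reclaim runs the shrinker
 * in the context of the allocating thread. */
static int toy_alloc_under_pressure(struct toy_dev *dev)
{
    return toy_inactive_shrink(dev, 128);
}

/* Driver path that allocates while holding the mutex. */
static int toy_driver_path(struct toy_dev *dev)
{
    int ret;

    if (!toy_trylock(&dev->struct_mutex))
        return -1;
    ret = toy_alloc_under_pressure(dev); /* shrinker fails every time */
    toy_unlock(&dev->struct_mutex);
    return ret;
}
```

In this model toy_driver_path() always gets -1 back and frees nothing, however
many objects are shrinkable, while a direct call to toy_inactive_shrink() with
the lock free makes progress.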

>> IOW, don't call kmalloc(GFP_KERNEL) while holding dev->struct_mutex. Otherwise,
>> vmscan in that call path completely fails to shrink the i915 cache, which badly
>> confuses memory reclaim if i915 has a lot of shrinkable pages.
>
> i915 can have several GiB of shrinkable pages. Of which 2 GiB may be tied
> up in the GTT upon which we have to wait for the GPU to release. In the
> future, we will be able to tie up all of physical memory.
>
> There is only a single potential kmalloc in the shrinker path, for which
> we could preallocate a request so that we always have one available here.

Again, waiting is not a problem if the wait is short enough. Btw, I think
preallocation must be implemented; otherwise the shrinker has no guarantee
that it can shrink.
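
A rough user-space sketch of what I mean by preallocation follows. The
toy_ring and toy_request types and the helper names are hypothetical, not
the i915 structures. The point is only that the allocation happens ahead of
time, so the shrink path takes a ready-made request and never allocates.

```c
#include <stdlib.h>

/* Hypothetical request object that the shrink path would need. */
struct toy_request {
    int seqno;
};

/* Hypothetical ring that keeps one spare request in reserve. */
struct toy_ring {
    struct toy_request *spare;
};

/* Allocate the spare up front, outside any reclaim path.
 * Returns 0 on success, -1 if the allocation failed. */
static int toy_ring_init(struct toy_ring *ring)
{
    ring->spare = malloc(sizeof(*ring->spare));
    return ring->spare ? 0 : -1;
}

/* Called from the shrink path: hands out the spare without allocating.
 * Returns NULL if the spare has already been taken. */
static struct toy_request *toy_take_spare(struct toy_ring *ring)
{
    struct toy_request *rq = ring->spare;

    ring->spare = NULL;
    return rq;
}

/* Called later, outside the shrink path, to put a request back in reserve. */
static void toy_return_spare(struct toy_ring *ring, struct toy_request *rq)
{
    ring->spare = rq;
}
```

Whoever takes the spare has to refill it later from a context where
allocating is safe. That bookkeeping is left out here.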

thanks.


>>> Unless I am mistaken, and there are more patches in flight, the return
>>> code from i915_gem_inactive_shrink() is promoted to unsigned long and then
>>> used in the calculation of how many objects to evict...
>>
>> shrinker->shrink returns an int. You can't change i915_gem_inactive_shrink()
>> without changing the generic shrinker code.
>> Do you really want to change it?
>
> No, just pointing out that the patch causes warnings from the shrinker
> code as it tries to process (unsigned long)-1 objects. shrink_slab() does
> not use <0 as an error code!

Look.

unsigned long shrink_slab(struct shrink_control *shrink,
			  unsigned long nr_pages_scanned,
			  unsigned long lru_pages)
{
(snip)
		while (total_scan >= SHRINK_BATCH) {
			long this_scan = SHRINK_BATCH;
			int shrink_ret;
			int nr_before;

			nr_before = do_shrinker_shrink(shrinker, shrink, 0);
			shrink_ret = do_shrinker_shrink(shrinker, shrink,
							this_scan);
			if (shrink_ret == -1)
				break;
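
For anyone who wants to play with that convention outside the kernel, here is
a simplified user-space model of the loop. The toy_* names and the
boolean lock flag are assumptions for illustration, and the loop leaves out
everything shrink_slab() does besides the batch handling. SHRINK_BATCH is
set to 128 here.

```c
#include <stdbool.h>

#define SHRINK_BATCH 128

/* Hypothetical stand-in for a driver cache; not the i915 structures. */
struct toy_cache {
    bool lock_held;  /* models "the lock cannot be taken right now" */
    int nr_objects;  /* objects that could be freed */
};

/* Shrinker callback following the convention in this thread:
 *   nr_to_scan == 0  -> report the object count without taking the lock
 *   lock unavailable -> return -1
 *   otherwise        -> free up to nr_to_scan, return the remaining count */
static int toy_shrink(struct toy_cache *c, int nr_to_scan)
{
    int freed;

    if (nr_to_scan == 0)
        return c->nr_objects;
    if (c->lock_held)
        return -1;

    freed = nr_to_scan < c->nr_objects ? nr_to_scan : c->nr_objects;
    c->nr_objects -= freed;
    return c->nr_objects;
}

/* Simplified batch loop: stops as soon as the shrinker returns -1.
 * Returns the number of objects freed. */
static long toy_shrink_slab(struct toy_cache *c, long total_scan)
{
    long freed = 0;

    while (total_scan >= SHRINK_BATCH) {
        int nr_before = toy_shrink(c, 0);
        int shrink_ret = toy_shrink(c, SHRINK_BATCH);

        if (shrink_ret == -1)
            break; /* never treated as an object count */
        if (shrink_ret < nr_before)
            freed += nr_before - shrink_ret;
        total_scan -= SHRINK_BATCH;
    }
    return freed;
}
```

With lock_held set, the loop exits on the first batch and reports 0 objects
freed. It never tries to process (unsigned long)-1 objects.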
