Re: [Intel-gfx] [PATCH] [RFC] mm/shrinker: Add a shrinker flag toalways shrink a bit

From: Dave Chinner
Date: Wed Sep 18 2013 - 19:52:46 EST


[my keyboard my be on the fritz - it's not typing what I'm thinking...]

On Thu, Sep 19, 2013 at 06:38:22AM +1000, Dave Chinner wrote:
> On Wed, Sep 18, 2013 at 12:38:23PM +0200, Knut Petersen wrote:
> > On 18.09.2013 11:10, Daniel Vetter wrote:
> >
> > Just now I prepared a patch changing the same function in vmscan.c
> > >Also, this needs to be rebased to the new shrinker api in 3.12, I
> > >simply haven't rolled my trees forward yet.
> >
> > Well, you should. Since commit 81e49f shrinker->count_objects might be
> > set to SHRINK_STOP, causing shrink_slab_node() to complain loud and often:
> >
> > [ 1908.234595] shrink_slab: i915_gem_inactive_scan+0x0/0x9c negative objects to delete nr=-xxxxxxxxx
> >
> > The kernel emitted a few thousand log lines like the one quoted above during the
> > last few days on my system.
> >
> > >diff --git a/mm/vmscan.c b/mm/vmscan.c
> > >index 2cff0d4..d81f6e0 100644
> > >--- a/mm/vmscan.c
> > >+++ b/mm/vmscan.c
> > >@@ -254,6 +254,10 @@ unsigned long shrink_slab(struct shrink_control *shrink,
> > > total_scan = max_pass;
> > > }
> > >+ /* Always try to shrink a bit to make forward progress. */
> > >+ if (shrinker->evicts_to_page_lru)
> > >+ total_scan = max_t(long, total_scan, batch_size);
> > >+
> > At that place the error message is already emitted.
> > > /*
> > > * We need to avoid excessive windup on filesystem shrinkers
> > > * due to large numbers of GFP_NOFS allocations causing the
> >
> > Have a look at the attached patch. It fixes my problem with the erroneous/misleading
> > error messages, and I think it´s right to just bail out early if SHRINK_STOP is found.
> >
> > Do you agree ?
>
> No, that's wrong. ->count_objects should never ass SHRINK_STOP.

*pass

> Indeed, it should always return a count of objects in the cache,
> regardless of the context.
>
> SHRINK_STOP is for ->scan_objects to tell the shrinker it can make

*can't

> any progress due to the context it is called in. This allows the
> shirnker to defer the work to another call in a different context.
> However, if ->count-objects doesn't return a count, the work that
> was supposed to be done cannot be deferred, and that is what

*why

> ->count_objects should always return the number of objects in the
> cache.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/