Re: Found the commit that causes the OOMs

From: Wu Fengguang
Date: Sun Jun 28 2009 - 11:04:36 EST


On Sun, Jun 28, 2009 at 10:49:52PM +0800, KOSAKI Motohiro wrote:
> >> In David's OOM case, there are two symptoms:
> >> 1) 70000 unaccounted/leaked pages as found by Andrew
> >>    (plus rather big number of PG_buddy and pagetable pages)
> >> 2) almost zero active_file/inactive_file; small inactive_anon;
> >>    many slab and active_anon pages.
> >>
> >> In the situation of (2), the slab cache is _under_ scanned. So David
> >> got OOM when vmscan should have squeezed some free pages from the slab
> >> cache. Which is one important side effect of MinChan's patch?
> >
> > My patch's side effect is (2).
> >
> > My guess is as follows.
> >
> > 1. The scanned-page count that shrink_slab uses (sc->nr_scanned) is
> >    increased in shrink_page_list, and it is increased twice for a
> >    mapped page or a swapcache page.
> > 2. shrink_page_list is called by shrink_inactive_list
> > 3. shrink_inactive_list is called by shrink_list
> >
> > Look at shrink_list.
> > If the inactive lru list is low, it always calls shrink_active_list, not
> > shrink_inactive_list, in the anon case.
> > That means sc->nr_scanned is not increased.
> > Then shrink_slab can't shrink enough slab pages.
> > So David's OOM shows a lot of slab pages and active anon pages.
> >
> > Does that make sense?
> > If it does, we have to change shrink_slab's pressure method.
> > What do you think?
>
> I'm confused.
>
> If the system has no swap, get_scan_ratio() always returns anon=0%.
> Then the number of inactive_anon pages has no effect on sc.nr_scanned.

You are right. Hehe, so that's not a real side effect.
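
For the record, the argument can be sketched as a toy userspace model.
This is not the kernel code: the toy_* names, the 50/50 split used when
swap is present, and the "scan the whole share in one pass" shortcut are
all assumptions made for illustration. It only shows that once the anon
share is 0%, whether inactive_anon is low cannot change nr_scanned.

```c
#include <stdbool.h>

/* Toy userspace model of the reasoning in this thread; not kernel code. */
struct toy_scan_control {
	unsigned long nr_scanned;	/* feeds slab pressure in this model */
};

/* Share of scanning given to anon vs. file pages, in percent. */
struct toy_scan_ratio {
	unsigned int anon;
	unsigned int file;
};

/*
 * With no swap the anon share is 0%, as KOSAKI points out.
 * The 50/50 split for the swap case is a placeholder assumption.
 */
static struct toy_scan_ratio toy_get_scan_ratio(bool have_swap)
{
	struct toy_scan_ratio r = { .anon = 50, .file = 50 };

	if (!have_swap) {
		r.anon = 0;
		r.file = 100;
	}
	return r;
}

/*
 * One pass over the anon lists.  When inactive anon is low, only the
 * active list is scanned and nr_scanned is left alone; otherwise every
 * page looked at on the inactive list is counted.
 */
static void toy_shrink_anon(struct toy_scan_control *sc,
			    unsigned long nr_to_scan,
			    bool inactive_anon_is_low)
{
	if (inactive_anon_is_low)
		return;		/* active-list scan: nr_scanned untouched */
	sc->nr_scanned += nr_to_scan;
}

/* Returns the nr_scanned value that slab shrinking would see. */
static unsigned long toy_reclaim_pass(bool have_swap,
				      bool inactive_anon_is_low,
				      unsigned long anon_pages,
				      unsigned long file_pages)
{
	struct toy_scan_control sc = { .nr_scanned = 0 };
	struct toy_scan_ratio r = toy_get_scan_ratio(have_swap);
	unsigned long nr_anon = anon_pages * r.anon / 100;
	unsigned long nr_file = file_pages * r.file / 100;

	if (nr_anon)
		toy_shrink_anon(&sc, nr_anon, inactive_anon_is_low);
	sc.nr_scanned += nr_file;	/* inactive file pages are counted */
	return sc.nr_scanned;
}
```

In this model, toy_reclaim_pass(false, ...) gives the same result for
either value of inactive_anon_is_low, while with swap present the two
cases differ. With almost no file pages, nr_scanned stays small either
way, which is consistent with symptom (2).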