Re: [PATCH]vmscan: handle underflow for get_scan_ratio

From: Shaohua Li
Date: Mon Apr 05 2010 - 21:27:52 EST


On Sun, Apr 04, 2010 at 08:48:38AM +0800, Wu, Fengguang wrote:
> On Fri, Apr 02, 2010 at 02:50:52PM +0800, Li, Shaohua wrote:
> > On Wed, Mar 31, 2010 at 01:53:27PM +0800, KOSAKI Motohiro wrote:
> > > > > On Tue, Mar 30, 2010 at 02:08:53PM +0800, KOSAKI Motohiro wrote:
> > > > > > Hi
> > > > > >
> > > > > > > > Commit 84b18490d1f1bc7ed5095c929f78bc002eb70f26 introduces a regression.
> > > > > > > > With it, our tmpfs test always ooms. The test has a lot of rotated anon
> > > > > > > > pages, which causes percent[0] to be zero. Actually percent[0] is a very
> > > > > > > > small value, but our calculation rounds it to zero. The commit makes vmscan
> > > > > > > > completely skip anon pages and causes oops.
> > > > > > > > One option is to force percent[x] to 1 if it is zero in get_scan_ratio().
> > > > > > > > See the patch below.
> > > > > > > > But the offending commit still changes behavior. Without the commit, we scan
> > > > > > > > all pages if priority is zero; the patch below doesn't fix this. I don't know
> > > > > > > > if it's required to fix this too.
> > > > > >
> > > > > > > Can you please post your /proc/meminfo and reproducer program? I'll dig into it.
> > > > > > >
> > > > > > > Very unfortunately, this patch isn't acceptable. In the past, vmscan
> > > > > > > had similar logic, but 1% swap-out caused lots of bug reports.
> > > > > > If 1% is still too big, how about the patch below?
> > > >
> > > > > This patch makes a lot more sense than the previous one. However, I think a <1%
> > > > > anon ratio shouldn't happen anyway, because the file lru doesn't have
> > > > > reclaimable pages. <1% doesn't seem like a good reclaim rate.
> > >
> > > > Oops, the above statement is wrong, sorry. Even 1 page is still too big,
> > > > because under a streaming io workload, the number of scanned anon pages should
> > > > be zero. This is a very strong requirement. If not, a backup operation will cause
> > > > a lot of swapping out.
> > > Sounds like there is no big impact on the workload you mentioned with the patch.
> > > Please see the description below.
> > > I updated the description of the patch as Fengguang suggested.
> >
> >
> >
> > > Commit 84b18490d introduces a regression. With it, our tmpfs test always ooms.
> > > The test uses a 6G tmpfs in a system with 3G memory. In the tmpfs, there are
> > > 6 copies of kernel source and the test does kbuild for each copy. My
> > > investigation shows the test has a lot of rotated anon pages and very few
> > > file pages, so get_scan_ratio calculates percent[0] to be zero. Actually
> > > percent[0] should be a very small value, but our calculation rounds it
> > > to zero. The commit makes vmscan completely skip anon pages and causes oops.
> > >
> > > To avoid underflow, we don't use percentages; instead we directly calculate
> > > how many pages should be scanned. In this way, we should get several scan pages
> > > for < 1% percent. With this fix, my test doesn't oom any more.
> > >
> > > Note, this patch doesn't really change the logic, but just increases precision. For
> > > systems with a lot of memory, this might slightly change behavior. For example,
> > > in a sequential file read workload, without the patch, we don't swap any anon
> > > pages. With it, if anon memory size is bigger than 16G, we will say one anon page
>
> see?
Thanks, will send an updated patch against -mm since we reverted the offending patch.
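
For reference, the rounding problem in the quoted description can be sketched
in userspace C. This is only an illustration, not the actual patch: the helper
names scan_by_percent() and scan_direct() are hypothetical, and the ap/fp
weights and the "+ 1" in the denominator are assumptions modeled on how
get_scan_ratio() is described above.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: scale a scan target by an integer percentage.
 * ap and fp stand for the anon and file weights (assumed names).
 * When ap is much smaller than fp, 100 * ap / (ap + fp + 1) truncates
 * to 0, so no anon pages are scanned at all.
 */
static uint64_t scan_by_percent(uint64_t nr_pages, uint64_t ap, uint64_t fp)
{
	uint64_t percent = 100 * ap / (ap + fp + 1);

	return nr_pages * percent / 100;
}

/*
 * Hypothetical sketch: apply the fraction directly to the page count.
 * The multiplication happens before the division, so ratios below 1%
 * still yield a few pages when nr_pages is large enough.
 */
static uint64_t scan_direct(uint64_t nr_pages, uint64_t ap, uint64_t fp)
{
	return nr_pages * ap / (ap + fp + 1);
}
```

With nr_pages = 4096, ap = 1 and fp = 1000, the percentage version yields no
anon pages to scan while the direct version still yields a few. The sketch
ignores overflow of nr_pages * ap, which real code would have to handle.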

Thanks,
Shaohua