Re: Default zone_reclaim_mode = 1 on NUMA kernel is bad for file/email/web servers

From: KAMEZAWA Hiroyuki
Date: Sun Sep 26 2010 - 22:11:47 EST


On Mon, 27 Sep 2010 11:04:54 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:

> > On Thu, 16 Sep 2010 19:01:32 +0900 (JST)
> > KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> >
> > > Yes, sadly Intel motherboards turn on zone_reclaim_mode by default, and
> > > the current zone_reclaim_mode doesn't fit the file/web server use case ;-)
> > >
> > > So, I've created a new proof-of-concept patch. It doesn't disable
> > > zone_reclaim entirely. Instead, it distinguishes file cache allocations
> > > from anon allocations, and only file cache allocations skip zone reclaim.
> > >
> > > That said, high-end HPC users often turn on cpuset.memory_spread_page and
> > > so avoid this issue. But why don't we consider avoiding it by default?
> > >
> > >
> > > Rob, I wonder if the following patch helps you. Could you please try it?
> > >
> > >
> > > Subject: [RFC] vmscan: file cache doesn't use zone_reclaim by default
> > >
> >
> > Hm, can't we use migration of file caches rather than pageout in
> > zone_reclaim_mode? Doesn't it fix anything?
>
> It doesn't.
>
> Two problems. 1) Migration makes a copy, so it's slower than zone_reclaim=0.
> 2) Migration is only effective if the target node has many free pages, but
> that is not a generic assumption.
>
> For this case, zone_reclaim_mode=0 is best. My patch works as second best.
> Yours works as third.
>

Hmm. I'm not sure whether it's "slower" or not. And migration doesn't
assume a particular target node, because it can use zonelist fallback.
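
To illustrate what I mean by fallback, here is a toy model in Python. It is
not kernel code: pick_target_node, the zonelist ordering, and the free-page
counts are all hypothetical names and numbers chosen only for illustration.

```python
def pick_target_node(zonelist, free_pages):
    """Return the first node in zonelist that has a free page, else None.

    zonelist   -- node ids in fallback order (hypothetical ordering)
    free_pages -- dict mapping node id to its free page count
    """
    for node in zonelist:
        if free_pages.get(node, 0) > 0:
            return node
    return None


# The preferred node (1) and the next one (2) are full, so the walk
# falls through to node 3.
target = pick_target_node([1, 2, 3], {1: 0, 2: 0, 3: 128})
```

The point is only that the caller does not have to name one node with plenty
of free memory in advance; the walk takes whichever node can satisfy it.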

I'm just concerned that kicked-out pages will be paged in again soon.

But OK, maybe it's complicated.
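
For anyone following the thread who wants to see what their machine is
currently running with, a minimal sketch follows. It assumes the sysctl is
exposed at /proc/sys/vm/zone_reclaim_mode and that the value is a bitmask
with the meanings given in the kernel's vm sysctl documentation. The helper
names are my own, not an existing tool.

```python
from pathlib import Path

# Bit meanings as described in the kernel's vm sysctl documentation.
ZONE_RECLAIM_BITS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def decode_zone_reclaim_mode(value):
    """Return descriptions of the bits set in value (empty list means off)."""
    return [desc for bit, desc in ZONE_RECLAIM_BITS.items() if value & bit]


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Return the current mode as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    if mode is None:
        print("zone_reclaim_mode is not available on this system")
    else:
        print(mode, decode_zone_reclaim_mode(mode) or ["zone reclaim off"])
```

A value of 0 here corresponds to the zone_reclaim_mode=0 case discussed above.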

Thanks,
-Kame
