Re: [PATCH 0/3] Unmapped page cache control (v5)

From: KOSAKI Motohiro
Date: Fri Apr 01 2011 - 03:57:10 EST


Hi

> > 1) zone reclaim doesn't work if the system has multiple nodes and the
> > workload is file cache oriented (e.g. file server, web server, mail server, etc.),
> > because zone reclaim frees many more pages than zone->pages_min, and
> > then new page cache requests consume the nearest node's memory, which
> > triggers the next zone reclaim. As a result, memory utilization is reduced and
> > unnecessary LRU discards increase dramatically.
> >
> > SGI folks added a CPUSET-specific solution in the past (cpuset.memory_spread_page),
> > but global reclaim still has this issue. Zone reclaim is an HPC-workload-specific
> > feature, and HPC folks have no reason not to use CPUSET.
>
> I am afraid you misread the patches and the intent. The intent is to
> explicitly enable control of unmapped pages, and it has nothing
> specifically to do with multiple nodes at this point. The control is
> system wide and carefully enabled by the administrator.

Hm. OK, I may have misread.
Can you please explain why the de-duplication feature needs to be selectable and
disabled by default? Does "explicitly enable" mean this feature is meant to address a corner-case issue?


> > 2) Before 2.6.27, the VM had only one LRU, and calc_reclaim_mapped() was used to
> > decide whether to filter out mapped pages. It caused a lot of problems for DB servers
> > and large application servers, because if the system has a lot of mapped
> > pages, 1) the LRU was churned and the reclaim algorithm became a lottery, and 2)
> > reclaim latency became terribly slow, so hangup detectors misdetected the
> > system state and started to force reboots. That was a big problem for RHEL5-based
> > banking systems.
> > So, sc->may_unmap should be killed in the future. Don't increase its uses.
> >
>
> Can you remove sc->may_unmap without removing zone_reclaim()? The LRU
> churn can be addressed at the time of isolation, I'll send out an
> incremental patch for that.

At least, I don't plan to do it, because the current zone_reclaim() works well on SGI
HPC workloads and a careless change could break them. In other words, they
understand that their workloads are HPC specific and they know what they are doing.

I'm worried about spreading zone_reclaim() usage _without_ removing its assumptions.
I wrote the following in my last mail.

> In other words, you have to kill the following three to get an ack: 1) zone
> reclaim oriented reclaim 2) filter based LRU scanning (e.g. sc->may_unmap)
> 3) fastpath overhead.

But there is probably another way. If you can improve zone_reclaim() so that it suits
more generic workloads and fits many more people, I'll ack this.

Thanks.

