On large memory systems, the VM can spend way too much time scanning
through pages that it cannot (or should not) evict from memory. Not
only does it use up CPU time, but it also provokes lock contention
and can leave large systems under memory pressure in a catatonic state.

I think I've run into #2 with kvm on s390 lately. I tried a large
setup with 200 guests running WebSphere. The guest memory is stored
in anonymous pages, and all guests are started from a script, so
everything is dirty initially. I use 200 GB of swap with 45 GB of
main memory for this scenario. Everything runs perfectly except when
vmscan is triggered for the first time: it starts writeback, and the
whole system freezes until it has paged out the 15 GB on the
inactive list. From there on, everything runs smoothly again with a
constant swap rate.
Against 2.6.26-rc2-mm1
This patch series improves VM scalability by:
1) putting filesystem backed, swap backed and non-reclaimable pages
onto their own LRUs, so the system only scans the pages that it
can/should evict from memory
2) switching to SEQ replacement for the anonymous LRUs, so the
   number of pages that need to be scanned when the system
   starts swapping is bounded to a reasonable number
3) keeping non-reclaimable pages off the LRU completely, so the
   VM does not waste CPU time scanning them. Currently only
   ramfs and SHM_LOCKED pages are kept on the noreclaim list;
   mlock()ed VMAs will be added later