Re: Page cache and swapping

Mark Hemment (markhe@sco.COM)
Thu, 26 Jun 1997 15:31:21 +0100 (BST)


On Thu, 26 Jun 1997, Benjamin C R LaHaise wrote:
> This is the difficult point - you need to walk the page tables of all
> processes to find out which ones are using the page - a very costly
> operation. This is why the current system doesn't do this.

Alternatively, it is possible to maintain pte-mapping chains.
These allow all ptes which reference a physical page to be found quickly.
This is ideal for reaping shared pages, or for moving pages around in core
to create a large, physically contiguous region.

I started work on the basis for this a few months back, but the
implementation sucked mud (i.e. it was slow). The overhead is in maintaining the
chains, which I had as unordered, doubly-linked lists of dynamically
allocated objects called vm_nodes.
The vm_nodes were linked off another object, called a pagable_node.
A pagable_node contained:
  o The vm_nodes
  o A link to a cut-down "struct page" (by using a link, I could
    copy a physical page's contents to another physical page,
    and update the link).
  o Various locks (I/O locks, memory locks)
  o Per-type data (types being Named, Shm and Anonymous pages).
A vm_node contained:
  o Ptr to the pte
  o Ptr to the vm_area
  o Linkage ptr to the next vm_node
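
For anyone trying to picture the layout, here is a rough user-space sketch
of the two objects described above. It is not the original code: the field
names, the pte_t typedef, the page_lite and vm_area stand-ins, the lock
placeholders and the three helper functions are all assumptions made up
for illustration. The prev pointer is inferred from the chains being
doubly-linked.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long pte_t;   /* assumption: stand-in for the kernel pte type */
struct vm_area;                /* opaque stand-in for the vm_area */
struct page_lite;              /* stand-in for the cut-down "struct page" */

enum pagable_type { PAGABLE_NAMED, PAGABLE_SHM, PAGABLE_ANON };

/* One vm_node per pte that maps the page. */
struct vm_node {
    pte_t          *pte;       /* ptr to the pte */
    struct vm_area *vma;       /* ptr to the vm_area (for flushes) */
    struct vm_node *next;      /* linkage to the next vm_node */
    struct vm_node *prev;      /* assumption: back link of the double list */
};

/* One pagable_node per pageable page. */
struct pagable_node {
    struct vm_node   *chain;       /* head of the unordered vm_node list */
    struct page_lite *page;        /* link to the physical page descriptor */
    unsigned int      io_locked;   /* placeholder for the I/O lock */
    unsigned int      mem_locked;  /* placeholder for the memory lock */
    enum pagable_type type;        /* selects the per-type data (omitted) */
};

/* Record a new mapping: push onto the head of the chain, O(1). */
static void chain_add(struct pagable_node *pn, struct vm_node *vn)
{
    assert(pn != NULL && vn != NULL);
    vn->prev = NULL;
    vn->next = pn->chain;
    if (pn->chain != NULL)
        pn->chain->prev = vn;
    pn->chain = vn;
}

/* Drop a mapping: O(1) because the list is doubly linked. */
static void chain_del(struct pagable_node *pn, struct vm_node *vn)
{
    assert(pn != NULL && vn != NULL);
    if (vn->prev != NULL)
        vn->prev->next = vn->next;
    else
        pn->chain = vn->next;
    if (vn->next != NULL)
        vn->next->prev = vn->prev;
    vn->next = vn->prev = NULL;
}

/* Clear every pte that maps the page without walking any page tables.
 * Returns the number of ptes visited. A real version would also flush
 * the TLB through vn->vma; that is left out here. */
static size_t chain_clear_ptes(struct pagable_node *pn)
{
    size_t n = 0;
    for (struct vm_node *vn = pn->chain; vn != NULL; vn = vn->next) {
        *vn->pte = 0;
        n++;
    }
    return n;
}
```

The cost shows up in chain_add() and chain_del(): every map and unmap has
to allocate or free a vm_node and touch the list, which is the maintenance
overhead mentioned above.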

To help get the performance up, I had lazy coalescing in the free-page
pool, and some other caches.
I started to try other things as well: anonymous inodes, a "swap"
file-system, ...

Anyway, before I waffle for too long, it all got too complicated and I
gave up.

I have now restarted this work. I was hacking last night to try to
move some of the page allocation towards a new scheme (a very simple change
which has a few small benefits all by itself).

I've decided to throw the vm_nodes away, and allocate _three_ pages for
each page-table: one (as at present) for the ptes, another for the
vm_area ptrs (which are needed for flush operations), and another for
linkage.
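
A rough sketch of how the three parallel pages might fit together, purely
as an illustration of the idea and not the actual patch. Everything named
here is an assumption: PTRS_PER_PTE of 1024 (i386 with 4K pages), the
pte_t typedef, the slot_ref type, and the two helpers. In particular the
post does not say what the linkage page holds; the sketch guesses that
each slot names the next pte slot mapping the same physical page. A
slot_ref is wider than one word, so a real linkage page would need a more
compact encoding.

```c
#include <assert.h>
#include <stddef.h>

#define PTRS_PER_PTE 1024      /* assumption: i386, 4K pages, 4-byte ptes */

typedef unsigned long pte_t;   /* assumption: stand-in for the kernel pte type */
struct vm_area;                /* opaque stand-in for the vm_area */
struct pte_table;

/* Hypothetical reference to a single pte slot; table == NULL ends a chain. */
struct slot_ref {
    struct pte_table *table;
    unsigned int      idx;
};

/* The three parallel arrays, indexed by the same pte slot number. */
struct pte_table {
    pte_t           pte[PTRS_PER_PTE];   /* page 1: the ptes, as today */
    struct vm_area *vma[PTRS_PER_PTE];   /* page 2: vm_area ptr per pte */
    struct slot_ref next[PTRS_PER_PTE];  /* page 3: linkage per pte */
};

/* Install a pte and push its slot onto the chain for the physical page.
 * No allocation is needed: the linkage storage already exists. */
static void slot_map(struct slot_ref *head, struct pte_table *t,
                     unsigned int i, pte_t val, struct vm_area *vma)
{
    assert(head != NULL && t != NULL && i < PTRS_PER_PTE);
    t->pte[i]  = val;
    t->vma[i]  = vma;
    t->next[i] = *head;        /* old head becomes our successor */
    head->table = t;
    head->idx   = i;
}

/* Clear every pte on the chain; returns how many slots were visited.
 * A real version would flush via t->vma[i] at each step. */
static size_t chain_unmap(struct slot_ref head)
{
    size_t n = 0;
    while (head.table != NULL) {
        struct pte_table *t = head.table;
        unsigned int i = head.idx;
        t->pte[i] = 0;
        head = t->next[i];
        n++;
    }
    return n;
}
```

The trade is memory for speed: the vm_area and linkage slots are there for
every pte whether used or not, but mapping a page no longer has to allocate
a separate object.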

Changing the Memory Management is very non-trivial. So don't hold your
breath waiting, but I'll hopefully have a patch out in the next 4
months...
Email on the subject is welcome at:
markhe@nextd.demon.co.uk

Regards,

markhe

-----------------------------------------------------
Mark Hemment UNIX/C Software Engineer (contractor)
"Success has many fathers. Failure is a b**tard"
-----------------------------------------------------