Re: [RFC] respect the referenced bit of KVM guest pages?

From: Andrea Arcangeli
Date: Thu Aug 06 2009 - 06:21:20 EST


On Thu, Aug 06, 2009 at 01:18:47PM +0300, Avi Kivity wrote:
> Reasonable; if you depend on a hint from userspace, that hint can be
> used against you.

Correct, that is my whole point. Also we never know if applications
are mmapping huge files with PROT_EXEC just because they might need to
trampoline once in a while, or do some little JIT thing once in a
while. Sometimes people open files with O_RDWR even if they only need
O_RDONLY. It's not a bug, but radically altering VM behavior because
of a bitflag doesn't sound good to me.

I certainly see that this tends to help, as it will reactivate all
.text. But IMHO it signals that current VM behavior is not ok for small
systems if such a hack is required. I prefer a dynamic algorithm
that, when the active list grows too much, stops reactivating pages and
limits the window for young bit activation to the time the page
sits on the inactive list. And if the active list is small (like on a
128M system) we fully trust the young bit, and if it is set we don't
allow the page to go to the inactive list, as it's quick enough to scan
the whole active list and the young bit is meaningful there.
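
To make the idea concrete, here is a rough userspace sketch of that
dynamic policy. It is not kernel code: the function names and the
SMALL_ACTIVE_LIST_PAGES cutoff are hypothetical, picked only to mirror
the 128M example above, and a real cutoff would have to be tuned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff: 128M worth of 4K pages. Not a kernel constant. */
#define SMALL_ACTIVE_LIST_PAGES ((size_t)(128u * 1024u * 1024u / 4096u))

/*
 * Active list scan: decide whether a page stays on the active list.
 * Small list: a full scan is quick, so the young bit is trusted.
 * Large list: stop reactivating and let the page move to the
 * inactive list whatever its young bit says.
 */
static bool keep_on_active_list(size_t nr_active_pages, bool young)
{
    if (nr_active_pages <= SMALL_ACTIVE_LIST_PAGES)
        return young;
    return false;
}

/*
 * Inactive list scan: the young bit only covers the time the page
 * sat on the inactive list, so it is still meaningful here.
 */
static bool reactivate_from_inactive_list(bool young_since_deactivation)
{
    return young_since_deactivation;
}
```

With these assumptions a young page survives on a 32768-page active
list, while on a list of a million pages it is deactivated and gets its
second chance only from references seen while it is inactive.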

The issue I can see is with huge systems with millions of pages in the
active list: by the time we scan it all, too much time has passed and we
don't get any meaningful information out of the young bit. Things are
radically different on all regular workstations, and frankly regular
workstations are very important too, as I suspect there are more users
running on <64G systems than on >64G systems.

> How about, for every N pages that you scan, evict at least 1 page,
> regardless of young bit status? That limits overscanning to a N:1
> ratio. With N=250 we'll spend at most 25 usec in order to locate one
> page to evict.

Yes exactly, I think something like that will be dynamic, and then we
can drop the VM_EXEC check and solve the issues on large systems while
still not almost totally ignoring the young bit on small systems.
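
A minimal sketch of the N:1 cap suggested in the quoted text, again in
plain userspace C rather than kernel code. The pick_victim() name and
the array-of-young-bits representation are assumptions for
illustration; a real scanner would walk the LRU and clear young bits as
it goes. Note that the quoted numbers (N=250, at most 25 usec) imply a
cost of roughly 100 nsec per scanned page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the index of the page to evict among young[0..len-1].
 * A page with the young bit clear is evicted as soon as it is found.
 * If max_scan (N) pages have been scanned and all were young, the
 * last one scanned is evicted anyway, which bounds overscanning to N:1.
 * Returns len if the list ends before either condition is met.
 */
static size_t pick_victim(const bool *young, size_t len, size_t max_scan)
{
    assert(max_scan > 0);

    for (size_t i = 0; i < len; i++) {
        if (!young[i])
            return i;   /* not recently referenced: evict */
        if (i + 1 >= max_scan)
            return i;   /* scan budget used up: evict regardless */
    }
    return len;
}
```

On a small system the first branch is expected to fire long before the
budget runs out, so the young bit is still respected; the second branch
only matters when nearly every page is young, which is the large-system
case.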