Re: [RFC/T/D][PATCH 2/2] Linux/Guest cooperative unmapped page cachecontrol

From: Balbir Singh
Date: Tue Jun 15 2010 - 06:19:09 EST


* Avi Kivity <avi@xxxxxxxxxx> [2010-06-15 12:44:31]:

> On 06/15/2010 10:49 AM, Balbir Singh wrote:
> >
> >>All we need is to select the right page to drop.
> >>
> >Do we need to drop to the granularity of the page to drop? I think
> >figuring out the class of pages and making sure that we don't write
> >our own reclaim logic, but work with what we have to identify the
> >class of pages is a good start.
>
> Well, the class of pages are 'pages that are duplicated on the
> host'. Unmapped page cache pages are 'pages that might be
> duplicated on the host'. IMO, that's not close enough.
>

Agreed, but what happens in reality with the code is that it drops
not-so-frequently-used cache (still reusing the reclaim mechanism),
but prioritizing cached memory.

> >>How can the host tell if there is duplication? It may know it has
> >>some pagecache, but it has no idea whether or to what extent guest
> >>pagecache duplicates host pagecache.
> >>
> >Well it is possible in host user space, I for example use memory
> >cgroup and through the stats I have a good idea of how much is duplicated.
> >I am ofcourse making an assumption with my setup of the cached mode,
> >that the data in the guest page cache and page cache in the cgroup
> >will be duplicated to a large extent. I did some trivial experiments
> >like drop the data from the guest and look at the cost of bringing it
> >in and dropping the data from both guest and host and look at the
> >cost. I could see a difference.
> >
> >Unfortunately, I did not save the data, so I'll need to redo the
> >experiment.
>
> I'm sure we can detect it experimentally, but how do we do it
> programatically at run time (without dropping all the pages).
> Situations change, and I don't think we can infer from a few
> experiments that we'll have a similar amount of sharing. The cost
> of an incorrect decision is too high IMO (not that I think the
> kernel always chooses the right pages now, but I'd like to avoid
> regressions from the unvirtualized state).
>
> btw, when running with a disk controller that has a very large
> cache, we might also see duplication between "guest" and host. So,
> if this is a good idea, it shouldn't be enabled just for
> virtualization, but for any situation where we have a sizeable cache
> behind us.
>

It depends, once the disk controller has the cache and the pages in
the guest are not-so-frequently-used we can drop them. Please remember
we still use the LRU to identify these pages.

> >>It doesn't, really. The host only has aggregate information about
> >>itself, and no information about the guest.
> >>
> >>Dropping duplicate pages would be good if we could identify them.
> >>Even then, it's better to drop the page from the host, not the
> >>guest, unless we know the same page is cached by multiple guests.
> >>
> >On the exact pages to drop, please see my comments above on the class
> >of pages to drop.
>
> Well, we disagree about that. There is some value in dropping
> duplicated pages (not always), but that's not what the patch does.
> It drops unmapped pagecache pages, which may or may not be
> duplicated.
>
> >There are reasons for wanting to get the host to cache the data
>
> There are also reasons to get the guest to cache the data - it's
> more efficient to access it in the guest.
>
> >Unless the guest is using cache = none, the data will still hit the
> >host page cache
> >The host can do a better job of optimizing the writeouts
>
> True, especially for non-raw storage. But even there we have to
> fsync all the time to keep the metadata right.
>
> >>But why would the guest voluntarily drop the cache? If there is no
> >>memory pressure, dropping caches increases cpu overhead and latency
> >>even if the data is still cached on the host.
> >>
> >So, there are basically two approaches
> >
> >1. First patch, proactive - enabled by a boot option
> >2. When ballooned, we try to (please NOTE try to) reclaim cached pages
> >first. Failing which, we go after regular pages in the alloc_page()
> >call in the balloon driver.
>
> Doesn't that mean you may evict a RU mapped page ahead of an LRU
> unmapped page, just in the hope that it is double-cached?
>
> Maybe we need the guest and host to talk to each other about which
> pages to keep.
>

Yeah.. I guess that falls into the domain of CMM.

> >>>2. Drop the cache on either a special balloon option, again the host
> >>>knows it caches that very same information, so it prefers to free that
> >>>up first.
> >>Dropping in response to pressure is good. I'm just not convinced
> >>the patch helps in selecting the correct page to drop.
> >>
> >That is why I've presented data on the experiments I've run and
> >provided more arguments to backup the approach.
>
> I'm still unconvinced, sorry.
>

The reason for making this optional is to let the administrators
decide how they want to use the memory in the system. In some
situations it might be a big no-no to waste memory, in some cases it
might be acceptable.

--
Three Cheers,
Balbir
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/