RE: [PATCH 7/8] zswap: add to mm/

From: Dan Magenheimer
Date: Wed Jan 02 2013 - 14:05:26 EST


> From: Dave Hansen [mailto:dave@xxxxxxxxxxxxxxxxxx]
> Subject: Re: [PATCH 7/8] zswap: add to mm/

Hi Dave --

I suspect we are in violent agreement but just to make sure...

Although zswap is the current example, I guess I am discussing
a bigger issue, which IMHO is much more important: How should
compression be utilized in the kernel (if at all)? Zswap is
simply one implementation of in-kernel compression (handling
anonymous pages only) and zcache is another (handling both
anonymous pages and pagecache pages). Each has some limited
policy, and some policy defaults, built in, but neither IMHO
is adequately aware of (let alone integrated with) MM policy to
be useful to a broad set of end users and to be enabled by default
by generic distros.

> On 01/02/2013 09:26 AM, Dan Magenheimer wrote:
> > However if one compares the total percentage
> > of RAM used for zpages by zswap vs the total percentage of RAM
> > used by slab, I suspect that the zswap number will dominate,
> > perhaps because zswap is storing primarily data and slab is
> > storing primarily metadata?
>
> That's *obviously* 100% dependent on how you configure zswap. But, that
> said, most of _my_ systems tend to sit with about 5% of memory in
> reclaimable slab

The 5% "sitting" number for slab is somewhat interesting, but
IMHO irrelevant here. The really interesting value is what percent
is used by slab when the system is under high memory pressure; I'd
imagine that number would be much smaller. True?
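
For concreteness, that share is easy to watch on a live system
via the MemTotal and SReclaimable fields of /proc/meminfo; a
quick userspace sketch (not part of any patch):

/* Quick userspace sketch (not part of any patch): report
 * reclaimable slab as a percentage of total RAM, using the
 * standard MemTotal: and SReclaimable: fields of /proc/meminfo
 * (values are in kB). */
#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/proc/meminfo", "r");
        char line[128];
        unsigned long total_kb = 0, sreclaim_kb = 0;

        if (!f)
                return 1;
        while (fgets(line, sizeof(line), f)) {
                sscanf(line, "MemTotal: %lu kB", &total_kb);
                sscanf(line, "SReclaimable: %lu kB", &sreclaim_kb);
        }
        fclose(f);
        if (!total_kb)
                return 1;
        printf("reclaimable slab: %lu kB (%.1f%% of RAM)\n",
                sreclaim_kb, 100.0 * sreclaim_kb / total_kb);
        return 0;
}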

> which is certainly on par with how I'd expect to see
> zswap used.

Are you suggesting that the default zswap_max_pool_percent
should be set to 5? (The current default is 20.) Zswap has
little or no value on a system that would otherwise never
swap, so why set the zswap limit so low? IMHO, even 20 may
be too low.
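
For reference, my understanding is that the knob simply caps the
number of pageframes the compressed pool may consume relative to
total RAM; roughly like this (an illustrative sketch with names
of my own choosing, not code quoted from the patch):

/* Illustrative sketch only -- names are mine, not quoted from
 * the zswap patch.  A max-pool-percent knob caps how many
 * pageframes the compressed pool may consume relative to RAM. */
#include <stdbool.h>

static bool pool_is_full(unsigned long pool_pages,
                unsigned long total_ram_pages,
                unsigned int max_pool_percent)
{
        /* e.g. with max_pool_percent == 20, zpages may occupy
         * at most one fifth of all pageframes */
        return pool_pages >= total_ram_pages * max_pool_percent / 100;
}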

> > I don't claim to be any kind of expert here, but I'd imagine
> > that MM doesn't try to manage the total amount of slab space
> > because slab is "a cost of doing business". However, for
> > in-kernel compression to be widely useful, IMHO it will be
> > critical for MM to somehow load balance between total pageframes
> > used for compressed pages vs total pageframes used for
> > normal pages, just as today it needs to balance between
> > active and inactive pages.
>
> The issue isn't about balancing. It's about reclaim where the VM only
> cares about whole pages. If our subsystem (zwhatever or slab) is only
> designed to reclaim _parts_ of pages, can we be successful in returning
> whole pages to the VM?

IMHO, it's about *both* balancing *and* reclaim. One remaining
major point of debate between zcache and zswap is that zcache
accepts lower density to ensure that whole pages can be easily
returned to the VM (and thus allow balancing) while zswap targets
best density (by using zsmalloc) and doesn't address returning
whole pages to the VM.
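
To put made-up numbers on that tradeoff: assume zpages average
40% of PAGE_SIZE. A zbud-like allocator pairs at most two zpages
per pageframe, so density is capped at 2.0 but evicting at most
two zpages always frees a whole pageframe; an ideal zsmalloc-like
packer reaches 2.5, but because objects may span page boundaries,
no small fixed set of evictions is guaranteed to free a frame.
A toy illustration (assumptions as stated, not measurements):

/* Toy arithmetic, not measurements.  Assumptions: zpages average
 * 40% of PAGE_SIZE; a zbud-like allocator stores at most two
 * zpages per pageframe; a zsmalloc-like allocator packs nearly
 * perfectly but lets objects span page boundaries. */
#include <stdio.h>

int main(void)
{
        double avg_zpage = 0.40;            /* assumed */
        double zbud_density = 2.0;          /* cap: two buddies/frame */
        double zsmalloc_density = 1.0 / avg_zpage;

        printf("zbud-like:     %.2f zpages/frame, %.0f%% of each frame idle\n",
                zbud_density, (1.0 - zbud_density * avg_zpage) * 100);
        printf("zsmalloc-like: %.2f zpages/frame, but evicting a few\n",
                zsmalloc_density);
        printf("               zpages need not free any whole pageframe\n");
        return 0;
}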

> The slab shrinkers only work on parts of pages (singular slab objects).
> Yet, it does appear that they function well enough when we try to
> reclaim from them. I've never seen a slab's sizes spiral out of control
> due to fragmentation.

Perhaps this is because the reclaimable slab objects are mostly
metadata that is highly connected to reclaimable data objects?
E.g., reclaiming most of the reclaimable data pages incidentally
reclaims most slab objects as well?

(Also, it is not the slab size that would be the issue here but
its density... i.e., if, after shrinking, 1000 pageframes contain
only 2000 assorted 4-byte objects, that would be "out of control".
Is there any easy visibility into slab density?)
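
Partially answering my own question: /proc/slabinfo appears to
expose enough to estimate per-cache density. A rough userspace
sketch, assuming the 2.x slabinfo format (reading slabinfo may
require root on some systems):

/* Rough userspace sketch: estimate per-cache slab density from
 * /proc/slabinfo (2.x format).  density = bytes held in live
 * objects / bytes of the pageframes backing the cache. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        FILE *f = fopen("/proc/slabinfo", "r");
        char line[512], name[64];
        unsigned long active, num, objsize, objper, pagesper;
        unsigned long active_slabs, num_slabs;
        long page_size = sysconf(_SC_PAGESIZE);

        if (!f)
                return 1;
        while (fgets(line, sizeof(line), f)) {
                if (sscanf(line,
                        "%63s %lu %lu %lu %lu %lu : tunables %*u %*u %*u"
                        " : slabdata %lu %lu",
                        name, &active, &num, &objsize, &objper,
                        &pagesper, &active_slabs, &num_slabs) != 8)
                        continue;       /* skips the header lines */
                if (!num_slabs)
                        continue;
                printf("%-24s density %5.1f%%\n", name,
                        100.0 * active * objsize /
                        (num_slabs * pagesper * page_size));
        }
        fclose(f);
        return 0;
}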

In any case, I would posit that both the nature of zpages and their
average size relative to a whole page is quite unusual compared to slab.
So while there may be some useful comparisons between zswap
and slab, the differences may warrant dramatically different policy.