RE: [GIT PULL] mm: frontswap (for 3.2 window)

From: Dan Magenheimer
Date: Wed Nov 02 2011 - 16:08:39 EST


> From: James Bottomley [mailto:James.Bottomley@xxxxxxxxxxxxxxxxxxxxx]
> Subject: RE: [GIT PULL] mm: frontswap (for 3.2 window)
>
> > Not quite sure what you mean here (especially for frontswap)...
>
> I mean could it be used in a more controlled situation than an
> alternative to swap?

I think it could, but I have focused on the cases that reduce
disk I/O: cleancache, which avoids refaults from disk, and
frontswap, which avoids swap-ins/outs. Did you have some other
kernel data in mind?

> OK, I still don't think you understand what I'm saying. Machines in a
> Data Centre tend to be provisioned to criticality. What this means is
> that the Data Centre has a bunch of mandatory work and a bunch of Best
> Effort work (and grades in between). We load up the mandatory work
> according to the resource limits being careful not to overprovision the
> capacity then we look at the spare capacity and slot in the Best effort
> stuff. We want the machine to run at capacity, not over it; plus we
> need to respond instantly for demands of the mandatory work, which
> usually involves either dialling down or pushing away best effort work.
> In this situation, action is taken long before the swap paths become
> active because if they activate, the entire machine bogs and you've just
> blown the SLA on the mandatory work.
>
> > It's true, those that are memory-rich and can spend nearly
> > infinite amounts on more RAM (and on high-end platforms that
> > can expand to hold massive amounts of RAM) are not tmem's
> > target audience.
>
> Where do you get the infinite RAM idea from? The most concrete example
> of what I said above are Lean Data Centres, which are highly resource
> constrained but they want to run at (or just below) criticality so that
> they get through all of the Mandatory and as much of the best effort
> work as they can.

OK, I think you are asking the same question as I answered for
Kame earlier today.

By "infinite" I am glibly describing any environment where the
data centre administrator positively knows the maximum working
set of every machine (physical or virtual) and can ensure in
advance that the physical RAM always exceeds that maximum
working set. As you say, these machines need not be configured
with a swap device as they, by definition, will never swap.

The point of tmem is to use RAM more efficiently by taking
advantage of all the unused RAM when the current working set
size is less than the maximum working set size. This is very
common in many data centres too, especially virtualized ones.
It turned out that an identical set of hooks made pagecache
compression possible, and swap-page compression more flexible
than zram, and that became the single-kernel user, zcache.

RAM optimization and QoS guarantees are generally mutually
exclusive, so this doesn't seem like a good test case for tmem
(but see below).

> > > This isn't an inherent design flaw, but it does ask the question "is
> > > your design scope too narrow?"
> >
> > Considering all the hazing that I've gone through to get
> > this far, you think I should _expand_ my design scope?!? :-)
> > Thanks, I guess I'll pass. :-)

(Sorry again for the sarcasm :-(

> Sure, I think the conclusion that Transcendent Memory has no
> applicability to a lean Data Centre isn't unreasonable; I was just
> probing to see if it was the only conclusion.

Now that I understand it better, I think it does have
a limited application for your Lean Data Centre...
but only to optimize the "best effort" part of the
data centre workload. That would probably be a relatively
easy enhancement... but, please, my brain is full now and
my typing fingers hurt, so can we consider it post-merge?

Thanks,
Dan