RE: Re: swap cache

David Lang (dlang@diginsite.com)
Mon, 21 Dec 1998 15:28:07 -0800 (PST)


-----BEGIN PGP SIGNED MESSAGE-----

One flaw I see in the "we can turn it off whenever we want" logic is that
if you compress something when you are IO bound (no loss) then get CPU
bound and need to access it you are no needing to uncompress while you are
CPU bound. There is no way to predict this potential loss (although
depending on the compression it may be very fast to uncompress)

David Lang

"If users are made to understand that the system administrator's job is to
make computers run, and not to make them happy, they can, in fact, be made
happy most of the time. If users are allowed to believe that the system
administrator's job is to make them happy, they can, in fact, never be made
happy."
- -Paul Evans (as quoted by Barb Dijker in "Managing Support Staff", LISA '97)

On Mon, 21 Dec 1998, Paul R. Wilson wrote:

> Date: Mon, 21 Dec 1998 14:37:11 -0600
> From: Paul R. Wilson <wilson@cs.utexas.edu>
> To: Jason Chaw <jchaw@acm.org>
> Cc: linux-kernel@vger.rutgers.edu
> Subject: RE: Re: swap cache
>
> >From jchaw@acm.org Mon Dec 21 08:35:39 1998
> >From: "Jason Chaw" <jchaw@acm.org>
> >To: <Paul.R.Wilson[SMTP:wilson@cs.utexas.edu]>,
> > <linux-kernel@vger.rutgers.edu>
> >Cc: <wilson@cs.utexas.edu>
> >Subject: RE: Re: swap cache
> >Date: Mon, 21 Dec 1998 22:30:45 +0800
> >Message-ID: <000401be2cee$7ebce5e0$0137a8c0@ddns.org>
> >
> >Today all workstations and servers have alot of memory, diskspace, bandwidth
> >(memory bus, disk I/O bus, etc).
>
> Not true. Many people have uses for big VM, myself and many other
> workstation users included. Give 128 meg, and I can write a simulator
> for my research that will use it all. ISP's care about memory and
> disk, too. And many people who run piggy applications, like Photoshop.
>
> If you do have plenty of RAM and disk, you should think about putting
> it to better use. For example, make your virtual memory system
> recoverable, and use your disk for logging. Compress pages so that
> that read performance doesn't matter, and so that logging is cheap
> and only bandwidth-limited. Encrypt your swap, and use the compression
> to make the encryption cheaper. Swap redundantly to multiple servers,
> so that you can tolerate a server crash, and restore on another machine
> in case of client failure.
>
> I don't think it will ever be true that we have enough memory or enough
> disk speed---only a lack of imagination about how computers could be
> better.
>
> One good example of a crying need for compressed virtual memory is Java.
> What happens when you implement a lot of OS-like functionality portably
> on top of a Java VM? You need more RAM, that's what. :-) (How much
> should you buy? As much as you can.) Try running nontrivial servlets
> on a JVM, and look at your RSS. Wow.
>
> The Java people are very concerned about this. In general, Java is a
> pig, and very favorable to compressed caching. Garbage-collected langauges
> make a fundamental space-time tradeoff, trading space for speed. Most
> of that space is wasted most of the time. Generational garbage
> collection helps, but the tradeoff is fundamental and can't be
> entirely avoided.
>
> I originally published the idea of compressed caching for virtual memory
> almost 10 years ago, motivated by concerns about the performance of GC'd
> systems. I pointed out that compressed VM gets better as CPU's get faster
> relative to disk, and that it's especially good for GC'd heaps. Now CPU's
> are very fast relative to disks, and Java is making garbage collection a
> serious issue that lots of people care about. The time is ripe.
>
> >Hence the efficacy of compressed virtual
> >memory(VM) is no longer realised. As Rik pointed out, the main problem is
> >drive seeks. We should minimize on drive seeks. To merely compress the pages
> >of virtual memory is of little use. Disk reads / writes will not be
> >optimized.
>
> See my previous postings. A major goal of compressed caching is is exactly
> to reduce seeks, though reduced bandwidth costs matter too, e.g., when paging
> over a network, or logging a recoverable VM.
>
> >For systems of low memory, The situation will be worse performance as the
> >limited memory resources will be further strained to accomodate the data
> >structures of the compression algorithm.
>
> This is not true. The data structures of the compression algorithm don't
> have to take up much space at all. For example, we usually get about
> a factor of two in compression for in-memory data with a 64-byte dictionary.
> I kid you not. Not 64 kilobytes (which isn't that much anyway), but
> 64 *bytes*. And the code is only a few KB. Even if you use traditional
> Lempel-Ziv style compression, you can get about a factor of two with
> a dictionary of 8KB and indexing structures of a few KB more. When
> we're talking about tens of megabytes of RAM, this is just negligible.
>
> And if you're worried about the metadata for the compression cache,
> don't be. This is a very easy allocation problem, because blocks don't
> have to be stored contiguously, and can be moved. (This is one of the
> few cases where a buddy system might be a good idea, because you can
> easily pre-fragment the blocks into conveniently-sized hunks, putting a
> very tight bound on the fragmentation and the size of the metadata.)
> Even a brute-force allocator that used sliding compaction would probably
> be plenty fast enough. Memory bandwidth is now cheap relative to disk
> costs, for most purposes.
>
> >Likewise for CPU resources.
> >The idea is to implement a VM which is optimized for swap in/out of the swap
> >partition. This will require simple and efficient algorithms.
>
> What makes you think I'm talking about anything complicated? Compared
> to the current complexity of Linux's VM implemetation, what I'm talking
> about is pretty minor. The basic adaptivity code is a few hundred lines,
> and is already written. A few hundred more lines would make it able to
> turn itself off completely in any situation where it's a lose.
> The compression algorithms are a couple of thousand lines, and are already
> written.
>
> My problem in getting it into Linux is figuring out what's already there,
> and why, and what I have to change and shouldn't change.
>
> Keep in mind that the timescale of a virtual memory is milliseconds, in
> terms of an individual disk seek, while the timescale of a CPU is nanoseconds,
> in terms of an instruction cycle. That's six orders of magnitude. If
> we execute thousands of instructions per fault, it's absolutely negligible.
>
> What would you do if you could afford to spend a *million* instruction
> cycles to save *one* disk seek? You *can* afford to do that on a modern
> machine, when it's I/O bound. The decision-making code is only a few
> thousand instructions per fault, and could probably be cut down to
> hundreds. The compression and decompression and decompression cost is
> the main thing---all the smarts are really cheap. And we can compress
> a page and decompress another in 200 microseconds. Even if that gets
> in your critical path, it usually won't hurt much.
>
> >We have to understand why we use VM.
>
> I think I understand VM (and garbage collectors and memory allocation
> and persistence and recoverable VM) about as well as anybody in the
> research world. What I don't yet understand is how Linux works, exactly. :-)
>
> | Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (wilson@cs.utexas.edu)
> | Papers on memory allocators, garbage collection, memory hierarchies,
> | persistence and Scheme interpreters and compilers available via ftp from
> | ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
>

-----BEGIN PGP SIGNATURE-----
Version: PGP for Personal Privacy 5.0
Charset: noconv

iQEVAwUBNn7ZiT7msCGEppcbAQGpDwf/fBICW+CvMmTNJML593kquXZROWr/2jFg
+l8xRNkqUX8tAVigznGo8TGXoY6zzwEVslAlmuICuF1Ks10b1W+ei25QKw1DdiOc
HkRFzOBQiBlKYBag3mCjPmUoKHaLtpWRtLLGa1SZZhzTiBY20NG72tKDV2kZS95x
drTxsxxkLW11hmrxYB7V1kldjX/uCX2M7e4N69/Sj+xfAZqvJkkSPkPYW/SPicfY
KEHRycnJzKwNLAolkMNRrD1+8yiIJsabzjh7uGMckM8U9sH04CWOi7o/zONcqfjf
ps4mJSEeYEsH3vw2+Hsg9cmlQ+5365URjz0MN3vAraXIjkICoa5ubg==
=9X8M
-----END PGP SIGNATURE-----

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/