mergemem utility: tests and BUGS

Marnix Coppens (maco@telindus.be)
Fri, 20 Mar 1998 09:30:01 +0100


I've done some hasty experiments with the mergemem patch on a 2.0.33 kernel.

Judging by the release number (0.03), this is still very much ALPHA code,
and from what I've witnessed, there is indeed some more work to be done
(this is certainly not meant as a discouragement, keep those ideas
coming).

I'll be doing some serious debugging during the weekend, but the following
may be worth looking into:

1) Right now, a /proc/mergemem file is created with a (dummy) read operation,
an open (and close) operation and a write operation that accepts a certain
struct, whose members may change as a result. The return value of write()
is used to indicate success or failure.

Instead of this, it *really should* be a char device driver, with just three
operations: open, close and ioctl. The ioctl operation is just right for
this sort of stuff. The driver can then accept two kinds of ioctl requests:
checksum and mergemem.
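
As a rough illustration of the ioctl-style interface suggested in point 1,
here is a user-space model of the dispatch logic. Everything in it is
hypothetical: the request codes, the struct layout and the stub_* helpers are
made up for this sketch and are not taken from the mergemem patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical request codes; the 0.03 patch defines no ioctl interface. */
enum mergemem_request {
    MERGEMEM_IOC_CHECKSUM = 1,  /* checksum one page of one process */
    MERGEMEM_IOC_MERGE    = 2   /* try to merge one page of two processes */
};

/* Hypothetical argument block: filled in by the caller, updated in place. */
struct mergemem_args {
    int           pid1;      /* first process */
    int           pid2;      /* second process (ignored for checksum) */
    unsigned long addr;      /* page-aligned virtual address */
    unsigned long checksum;  /* out: result of a checksum request */
    int           merged;    /* out: 1 if the page was merged, else 0 */
};

/* Placeholder for the real checksum: just returns the page number. */
static unsigned long stub_checksum(int pid, unsigned long addr)
{
    (void)pid;
    return addr >> 12;
}

/* Placeholder for the real merge: this model never merges anything. */
static int stub_merge(int pid1, int pid2, unsigned long addr)
{
    (void)pid1; (void)pid2; (void)addr;
    return 0;
}

/* One entry point, two request kinds, negative errno on failure. */
int mergemem_ioctl(unsigned int request, struct mergemem_args *args)
{
    if (args == NULL)
        return -EFAULT;

    switch (request) {
    case MERGEMEM_IOC_CHECKSUM:
        args->checksum = stub_checksum(args->pid1, args->addr);
        return 0;
    case MERGEMEM_IOC_MERGE:
        args->merged = stub_merge(args->pid1, args->pid2, args->addr);
        return 0;
    default:
        return -EINVAL;  /* unknown request */
    }
}
```

The point of the design is that the request kind travels in its own argument
and the return value carries only success or an error code, rather than
overloading the return value of write().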

2) This is probably something weird about shared libraries, but I had hoped
a lot more of their data could have been shared. Try the following:

$ dc &
$ dc &
$ mergemem -n dc
...
comparing mapping 400a9000-400dc000 (51 pages, 204 KB)
pid 9160: 0p merged, 1p shared, 0p errors, 50p not merged
pid 9159: 0p merged, 1p shared, 0p errors, 50p not merged
...

The extract shown above belongs to the libc.so.5.4.23 inode (20451):

$ cat /proc/9159/maps
...
40008000-400a3000 r-xp 00000000 03:02 20451 --> 155 pages
400a3000-400a9000 rw-p 0009a000 03:02 20451 --> 6 pages
400a9000-400dc000 rw-p 00000000 00:00 0 --> 51 pages
bfffe000-c0000000 rwxp fffff000 00:00 0 --> stack

$ size /lib/libc.so.5.4.23
   text    data     bss     dec     hex filename
 515768  137534  207900  861202   d2412 /lib/libc.so.5.4.23
 (126p)   (34p)   (51p)
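
The page counts quoted above can be reproduced with a little arithmetic. The
sketch below assumes 4 KB pages (as on i386); the function and macro names
are made up for the example.

```c
#define PAGE_SIZE_BYTES 4096UL  /* assumption: 4 KB pages, as on i386 */

/* Pages needed to hold `bytes` bytes, rounding up (for `size` output). */
unsigned long bytes_to_pages(unsigned long bytes)
{
    return (bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}

/* Pages covered by a [start, end) address range from /proc/<pid>/maps. */
unsigned long range_to_pages(unsigned long start, unsigned long end)
{
    return (end - start) / PAGE_SIZE_BYTES;
}
```

For instance, bytes_to_pages(207900) gives the 51 pages of bss, and
range_to_pages(0x400a9000, 0x400dc000) gives the same 51 pages for the
anonymous mapping.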

Apparently, 50 of the 51 pages in the bss section (!) cannot be shared
between the two instances of dc, which differ only in their pid.
Can someone explain this? It looks like only a dump of those particular
pages will show where they differ.

FWIW, the checksum operation also failed a few dozen times with the error
MERGEMEM_NOPAGE1, which is really odd. The get_phys_addr() routine used for
this is a straight copy from fs/proc/array.c, so that part should be all right.
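
If the pages in question were dumped, comparing them could look something
like the sketch below. It assumes the two pages have already been copied
into 4 KB buffers by some other means (not shown); the helper names are
invented for this example.

```c
#include <stddef.h>

#define PAGE_BYTES 4096  /* assumption: 4 KB pages */

/* Offset of the first differing byte, or -1 if the pages are identical. */
long first_difference(const unsigned char *a, const unsigned char *b)
{
    size_t i;

    for (i = 0; i < PAGE_BYTES; i++)
        if (a[i] != b[i])
            return (long)i;
    return -1;
}

/* Number of byte positions at which the two pages differ. */
size_t count_differences(const unsigned char *a, const unsigned char *b)
{
    size_t i, n = 0;

    for (i = 0; i < PAGE_BYTES; i++)
        if (a[i] != b[i])
            n++;
    return n;
}
```

A page that differs in only a handful of bytes would point at some small
per-process value, whereas widespread differences would suggest something
else is going on.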

3)[BUG] This may be related to the error above, but in certain circumstances,
when you *keep doing* mergemem on two apps, it keeps finding pages to share,
although in reality there aren't any more after the first time. As a nasty
side effect, the RSS of the second instance keeps decreasing until it wraps
and becomes really high again :). Luckily, this crashes neither the kernel
nor the application, but it sure isn't healthy.

Try this:

Create two rxvts (or xterms) and find their pids. Leave one of them alone,
and in the other one, start the following:

$ while true; do mergemem -p pid1 -p pid2; done

This will cause this instance of rxvt to change its contents a lot.
The mergemem output shows it keeps on finding 376 KB to share every time.
Running top or ps -aux tells a different tale. Notice what happens to the
RSS of the other instance...
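
One plausible reading of the RSS wrap-around in point 3 (an assumption, not
something verified against the patch) is that an unsigned page counter is
decremented again for pages that were already shared. The sketch below shows
the effect in isolation; both function names are hypothetical.

```c
#include <limits.h>

/* Unchecked decrement: wraps to a huge value when pages > rss. */
unsigned long rss_sub_unchecked(unsigned long rss, unsigned long pages)
{
    return rss - pages;  /* unsigned arithmetic is modulo ULONG_MAX + 1 */
}

/* Guarded decrement: never goes below zero. */
unsigned long rss_sub_clamped(unsigned long rss, unsigned long pages)
{
    return pages > rss ? 0 : rss - pages;
}
```

With 4 KB pages, the 376 KB reported on every run corresponds to 94 pages.
Subtracting that repeatedly from a counter that only dropped once in reality
eventually goes past zero: rss_sub_unchecked(2, 3) yields ULONG_MAX, while
rss_sub_clamped(2, 3) yields 0. Clamping only hides the symptom, of course;
the real fix would be not to count the same page twice.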

---
To conclude, this can really be a useful tool, but right now, it's
more like an uncut diamond :^). As promised, I'll look deeper into it
during the next couple of days and see what exactly happens.

Marnix Coppens

---
Reality is that which                   | Artificial Intelligence
when you stop believing                 | stands no chance against
in it doesn't go away. (Philip K. Dick) | Natural Stupidity.
