Re: The Linux Kernel Project Management System (INITIAL PROPOSAL)

Larry McVoy (lm@bitmover.com)
Tue, 28 Sep 1999 10:45:20 -0700


On Tue, Sep 28, 1999 at 06:11:51PM +0200, Henning P. Schmiedehausen wrote:
> lm@bitmover.com (Larry McVoy) writes:
> >Cool. So explain to us how Aegis handles the following:

I dunno how Aegis handles all of these, that's why I asked. I have some
hunches though, but I thought Peter could provide real answers.

I can say how BK does it (briefly):

> > . disconnected operation with updates to all repositories

All repositories have history, the history is a graph, and changes are
propagated as subsections of that graph. The graphs are restored
identically in all repositories, so you can run md5 over the revision
history files and compare them.
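
A minimal sketch of that comparison, assuming each repository is just a
directory tree of revision history files. The function names and the
directory layout are hypothetical, not BK's own tooling:

```python
import hashlib
from pathlib import Path


def md5_of_file(path):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def history_digests(repo_root):
    """Map each file's path (relative to repo_root) to its MD5 digest."""
    root = Path(repo_root)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differing_files(repo_a, repo_b):
    """List relative paths whose digests differ or that exist on one side only."""
    a = history_digests(repo_a)
    b = history_digests(repo_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

If the histories really are restored identically, differing_files() on two
fully synced repositories should come back empty.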

> > . undo - including things you don't do but incorporated

We're changeset based, which means what you think of as a patch comes in
as a recorded, immutable chunk of work. You can do all operations on
changesets: send them, receive them, browse, diff, and undo them.

> > . multiple views, each of which include some subset of the others

That's what LODs (lines of development) are: an LOD is a named branch into
which you can pull changes (or actually type them in :-).

BK (really SCCS) is actually a set-oriented system. To get a version of the
file, you determine the elements of the set and ask the file to give you
that. Each delta is an element, and the normal way of getting the set is
to start at the delta you want and work your way back up to the root of
the graph.

But you don't have to do it that way - you can take elements from any and
all branches to make up a delta. You can create a set which has never been
seen before if you like.

An LOD is the recorded history of the sets which have been instantiated
in this branch. Normally, you don't think about that, you just do checkin
and checkout. But when you are putting together a release, you may well
create an LOD and pull bits and pieces of other LODs together to create
the release.
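
To make the set-oriented model above concrete, here is a small sketch. The
delta ids, the PARENT table, and both function names are made up for
illustration and are not how BK or SCCS store things:

```python
# Hypothetical delta graph: each delta id maps to its parent (None for the root).
PARENT = {
    "1.1": None,
    "1.2": "1.1",
    "1.3": "1.2",
    "1.2.1.1": "1.2",  # a branch off 1.2
}


def default_set(delta, parent=PARENT):
    """The normal set: start at `delta` and walk back up to the root."""
    members = set()
    while delta is not None:
        members.add(delta)
        delta = parent[delta]
    return members


def custom_set(base, include=(), exclude=(), parent=PARENT):
    """A set that may never have existed before: the normal set for `base`,
    plus extra deltas from other branches, minus any deltas left out."""
    members = default_set(base, parent)
    members |= set(include)
    members -= set(exclude)
    return members
```

For example, custom_set("1.3", include=["1.2.1.1"]) takes the trunk at 1.3
and adds one delta from the branch. An LOD, in these terms, would be the
recorded sequence of such sets.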

> > . compressed repositories

We gzip the data portion of the file (typically 95% of the file is
data, but it varies: if you typed more comments than data, the
compression won't help much). The kernel history - WITH NO COMMENTS -
compressed from 140MB to 38MB. With comments, I'd guess the compressed
size would be more like 45-50MB (i.e., less good).

> > . disk and/or file systems which corrupt the underlying files

We have a really bad 16 bit checksum over the entire revision history
and then another checksum over the data in each version of the file.
The checksum is this stupid 16 bit thing because that is what SCCS uses
and we are compatible. If we ever hit a case where it doesn't catch an
error, we'll break compat and do adler32 (what zlib uses). We have a
pretty good mechanism for checking if it is broken: we can check each
delta checksum.
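
A rough illustration of why a 16 bit checksum is weak. The sum16() below
is a plain additive sum of bytes, used here as an assumed stand-in rather
than a byte-exact copy of the SCCS algorithm; zlib.adler32() is the
standard library's Adler-32:

```python
import zlib


def sum16(data: bytes) -> int:
    """Additive 16-bit checksum: the sum of all bytes, modulo 65536."""
    return sum(data) & 0xFFFF


# Two different files made of exactly the same bytes in a different order.
original = b"int a = 1;\nint b = 2;\n"
swapped = b"int b = 1;\nint a = 2;\n"

# The additive sum ignores byte order, so it cannot tell these apart.
same_sum16 = sum16(original) == sum16(swapped)

# Adler-32 also tracks a position-dependent running sum, so it can.
same_adler = zlib.adler32(original) == zlib.adler32(swapped)
```

Here same_sum16 is True and same_adler is False: the reordering slips past
the additive sum but not past Adler-32.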

> > . low bandwidth links

When using ssh, we compress. Regardless, we send data which is very close
to the number of bits you added (look at diff -n output, it's like that).
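
To see why the transfer is roughly the size of what you added, here is a
sketch that produces commands in the style of diff -n output (aN M to add
M lines after line N, dN M to delete M lines starting at line N). It uses
Python's difflib and is only an approximation of that format, not what BK
actually sends:

```python
import difflib


def rcs_style_delta(old_lines, new_lines):
    """Return a list of diff -n style commands plus the added text lines."""
    out = []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            # Delete (i2 - i1) lines starting at 1-based old line i1 + 1.
            out.append(f"d{i1 + 1} {i2 - i1}\n")
        if tag in ("insert", "replace"):
            # Add (j2 - j1) lines after old line i2, followed by the text.
            out.append(f"a{i2} {j2 - j1}\n")
            out.extend(new_lines[j1:j2])
    return out


def delta_size(old_lines, new_lines):
    """Number of bytes in the delta, for comparison with the full file size."""
    return sum(len(line.encode()) for line in rcs_style_delta(old_lines, new_lines))
```

Adding one line to a large file yields one short command plus that line,
so delta_size() stays close to the size of the addition no matter how big
the file is.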

-- 
---
Larry McVoy            	   lm@bitmover.com           http://www.bitmover.com/lm 
