Re: Mercurial 0.4b vs git patchbomb benchmark

From: Matt Mackall
Date: Fri Apr 29 2005 - 11:40:31 EST

On Fri, Apr 29, 2005 at 07:34:15AM -0700, Linus Torvalds wrote:
> On Fri, 29 Apr 2005, Matt Mackall wrote:
> >
> > Mercurial is even younger (Linus had a few days' head start, not to
> > mention a bunch of help), and it is already as fast as git, relatively
> > easy to use, much simpler, and much more space and bandwidth
> > efficient.
> You've not mentioned two out of my three design goals:
> - distribution
> - reliability/trustability
> ie does mercurial do distributed merges, which git was designed for, and
> does mercurial notice single-bit errors in a reasonably secure manner, or
> can people just mess with history willy-nilly?

Distribution: yes, it does BK/Monotone-style branching and merging.
In fact, these should be more "correct" than git as it has DAG
information at the file level in the case where there are multiple
ancestors at the changeset graph level:

M    M1   M2

|`-------v      M2 clones M
aB       AB     file A is changed in mainline
|`---v   AB'    file B is changed in M2
|    aB / |     M1 clones M
|    ab/  |     M1 changes B
|    ab'  |     M1 merges from M2, changes to B conflict
|    |    A'B'  M2 changes A
|         a'B'  M2 merges from mainline, changes to A conflict
???             depending on which ancestor we choose, we will have
                to redo A hand-merge, B hand-merge, or both
                but if we look at the files independently, everything
                is fine
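
A rough sketch of the point above, in Python. It assumes one particular
reading of the diagram; the graph dictionaries and the helper names
(ancestors, merge_bases) are made up for illustration and are not
Mercurial's code. It compares the candidate merge bases at the changeset
level with those at the file level:

```python
def ancestors(parents, node):
    """Return node plus everything reachable from it through parent links."""
    seen = set()
    stack = [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen


def merge_bases(parents, left, right):
    """Common ancestors of left and right that are not themselves
    ancestors of another common ancestor (the candidate merge bases)."""
    common = ancestors(parents, left) & ancestors(parents, right)
    return sorted(
        c for c in common
        if not any(c != o and c in ancestors(parents, o) for o in common)
    )


# Changeset graph (child -> parents), an assumed reading of the diagram.
changesets = {
    "AB": [],
    "aB": ["AB"],            # mainline changes A
    "AB'": ["AB"],           # M2 changes B
    "ab": ["aB"],            # M1 clones M, then changes B
    "ab'": ["ab", "AB'"],    # M1 merges from M2
    "A'B'": ["AB'"],         # M2 changes A
    "a'B'": ["A'B'", "aB"],  # M2 merges from mainline
}

# Per-file graphs for the same (assumed) history.
file_a = {"A": [], "a": ["A"], "A'": ["A"], "a'": ["A'", "a"]}
file_b = {"B": [], "B'": ["B"], "b": ["B"], "b'": ["b", "B'"]}

print(merge_bases(changesets, "ab'", "a'B'"))  # two candidate bases
print(merge_bases(file_a, "a", "a'"))          # a single base
print(merge_bases(file_b, "b'", "B'"))         # a single base
```

Under these assumptions the changeset graph offers two equally good
ancestors, while each file graph has exactly one, and that one is
already one side of the merge, so neither hand-merge has to be redone.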

> For the latter, the cryptographic nature of sha1 is an added bonus - the
> _big_ issue is that it is a good hash, and an _extremely_ effective CRC
> of the data. You can't mess up an archive and lie about it later. And if
> you have random memory or filesystem corruption, it's not a "shit happens"
> kind of situation - it's a "uhhoh, we can catch it (and hopefully even fix
> it, thanks to distribution)" thing.

Reliability/trustability: Mercurial is using a SHA1 hash as a checksum
as well, much like Monotone and git. A changeset contains a hash of a
manifest which contains a hash of each file in the project, so you can
do things like sign the manifest hash (though I haven't implemented it
yet). Making a backup is as simple as making a hardlink branch:

mkdir backup
cd backup
hg branch ../linux # takes about a second
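
The changeset -> manifest -> file hash chain described above can be
sketched roughly as follows. This is only an illustration: the text
layouts used for the manifest and the changeset are assumptions made up
for the example, not Mercurial's on-disk format, and the function names
are hypothetical.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """SHA1 digest of data as a 40-character hex string."""
    return hashlib.sha1(data).hexdigest()


def manifest_text(files: dict) -> bytes:
    """One 'hash path' line per file, sorted by path (made-up layout)."""
    lines = [f"{sha1_hex(content)} {path}"
             for path, content in sorted(files.items())]
    return "\n".join(lines).encode()


def changeset_hash(files: dict, message: str) -> str:
    """Hash of a changeset that embeds the manifest hash (made-up layout)."""
    manifest_hash = sha1_hex(manifest_text(files))
    return sha1_hex(f"manifest {manifest_hash}\n\n{message}".encode())


tree = {"Makefile": b"all:\n", "README": b"hello\n"}
good = changeset_hash(tree, "initial import")

# Flip the lowest bit of the first byte of one file.
readme = tree["README"]
damaged = dict(tree, README=bytes([readme[0] ^ 1]) + readme[1:])
bad = changeset_hash(damaged, "initial import")

print(good != bad)  # the single-bit error changes the changeset hash
```

Because every level hashes the level below it, checking (or signing)
the one hash at the top covers every file in the project.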

> I had three design goals. "disk space" wasn't one of them, so you've
> concentrated on only one so far in your arguments.

That's because no one paid attention until I posted performance
numbers comparing it to git! Mercurial's goals are:

- to scale to the kernel development process
- to do clone/pull style development
- to be efficient in CPU, memory, bandwidth, and disk space
  for all the common SCM operations
- to have strong repo integrity

It's been doing all that quite nicely since its first release.
The UI is also pretty straightforward:

Setting up a Mercurial project:

$ cd linux/
$ hg init # creates .hg
$ hg status # show changes between repo and working dir
$ hg addremove # add all unknown files and remove all missing files
$ hg commit # commit all changes, edit changelog entry

Mercurial will look for a file named .hgignore in the root of your
repository, which contains a set of regular expressions to ignore in
file paths.
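
As a rough illustration of regex-based ignoring, here is a small Python
sketch. It assumes one pattern per line and that a pattern may match
anywhere in the repository-relative path; both are assumptions for the
example, as are the sample patterns and the function names.

```python
import re

# Hypothetical .hgignore contents, not taken from a real repository.
HGIGNORE = r"""
\.o$
\.orig$
^scripts/tmp/
"""


def load_ignore_patterns(text):
    """Compile one regular expression per non-empty line."""
    return [re.compile(line) for line in text.splitlines() if line.strip()]


def is_ignored(path, patterns):
    """True if any pattern matches somewhere in the path."""
    return any(p.search(path) for p in patterns)


patterns = load_ignore_patterns(HGIGNORE)
for path in ["kernel/sched.c", "kernel/sched.o", "scripts/tmp/x"]:
    print(path, is_ignored(path, patterns))
```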

Mercurial commands:

$ hg history # show changesets
$ hg log Makefile # show commits per file
$ hg diff # generate a unidiff
$ hg checkout # check out the tip revision
$ hg checkout <hash> # check out a specified changeset
$ hg add foo # add a new file for the next commit
$ hg remove bar # mark a file as removed

Branching and merging:

$ cd ..
$ mkdir linux-work
$ cd linux-work
$ hg branch ../linux # create a new branch
$ hg checkout # populate the working directory
$ <make changes>
$ hg commit
$ cd ../linux
$ hg merge ../linux-work # pull changesets from linux-work

Importing patches:

$ patch < ../p/foo.patch
$ hg addremove
$ hg commit

$ patch < ../p/foo.patch
$ hg commit `lsdiff -p1 ../p/foo.patch`

$ cat ../p/patchlist | xargs hg import -p1 -b ../p

Network support:

# export your .hg directory as a directory on your webserver
foo$ ln -s .hg ~/public_html/hg-linux

# merge changes from a remote machine
bar$ hg init # create an empty repo
bar$ hg merge http://foo/~user/hg-linux # populate it
bar$ <do some work>
bar$ hg merge http://foo/~user/hg-linux # resync

This is just a proof of concept of grabbing byte ranges, and is not
expected to perform well.
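
For readers unfamiliar with byte-range fetching over plain HTTP, a
minimal Python sketch follows. It only shows the general technique (an
HTTP Range header); the helper names and the file name in the usage
line are hypothetical, and nothing here is taken from Mercurial's
implementation.

```python
import urllib.request


def range_request(url, start, length):
    """Build a GET request for `length` bytes starting at offset `start`."""
    end = start + length - 1  # HTTP byte ranges are inclusive
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={start}-{end}")
    return req


def fetch_range(url, start, length):
    """Fetch the range; a server that honours it answers 206 Partial Content."""
    with urllib.request.urlopen(range_request(url, start, length)) as resp:
        return resp.status, resp.read()


# "some-file" is a placeholder, not a real file in the repository layout.
req = range_request("http://foo/~user/hg-linux/some-file", 100, 50)
print(req.get_header("Range"))
```

Each such fetch is a separate round trip, which is one plausible reason
an approach like this would not be expected to perform well.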

Mathematics is the supreme nostalgia of our time.