Re: linux-kernel-digest V1 #4524

Peter Miller (peterm@jna.com.au)
Tue, 28 Sep 1999 13:45:03 +0000


Larry McVoy writes
> On Tue, Sep 28, 1999 at 09:41:11AM +0000, Peter Miller wrote:
> > Please note that Aegis models the change lifecycle, it isn't just
> > a file babysitter. It sets out to do a LOT more than simple
> > file versioning.
>
> That's nice. Are you implying that everyone else is just doing RCS?

No, I'm saying there is more to the change lifecycle than the
checkout at the beginning and the commit at the end. Much more.

> BK certainly does more than that, you get full changesets, complete
> reproducability in a distributed environment, etc. So what's your
> point?

Software is more than the mere aggregation of the source files.
What about: the software is supposed to *do* something. What about
input validation? (Does it build, does it test OK kinda stuff.)
What about peer review?

> > instead of hammering on it
> > directly, you work in a change set.
>
> What if I don't want to do that?

NO ONE works directly with their system. Who in their right mind
edits the *,v or s.* files directly? We all work through the agency
of a change lifecycle of some sort. Aegis just makes it visable.

> > > . undo - including things you don't do but incorporated
> >
> > Aegis doesn't have uncomditional commit/incorporate.
>
> Pity for it. BK does. BK is a strict superset of Aegis in this respect.
> With the difference being that we don't encode our policy into the
> mechanism.

Aegis is *only* policy. It's doesn't have any built-in history
tool, it delegates that minor (absolutely essential but minor)
portion of the lifecycle to the tool of your choice. It doesn't
have a built-in merge tool, build tool or test tool, either.

I think you'll find BK is a subset, not a superset.

> Even though I happen to agree, in principle, with the policies
> which Aegis enforces, that doesn't mean that they are right.

Everyone is *always* free to roll their own. But, because they are
not SCM experts, they will more than likely come up with an inferior
solution. (To a very OLD problem I might add. Why do we insist
on re-inventing this particular wheel?) Aegis provides a well-thought-out
flexible shown-to-work set of policies. Not the only policies,
but good ones.

> sticking your methodology in my face.

Aegis can do nothing else, that's all it is.

And this is the whole point: the problems which beset distributed
development are *policy* problems, not file version problems.

Un-buildable commits are not a version problem, they are a policy
problem: How else do you change "you really should not commit
unbuilable changes pretty please" into "you CAN NOT commit unbuildable
changes" (I deliberately don't say "may not" here.)

Un-peer-reviewed commits are not a version problem, they are a
policy problem: How else do you change "you really should not
commit unreviewed changes pretty please" into "you CAN NOT commit
unreviewed changes"

Un-tested commits are not a version problem, they are a
policy problem: How else do you change "you really should not
commit untested changes pretty please" into "you CAN NOT commit
untested changes"

> [Linux done good so far.] Kinda makes your methodology look a
> little silly by comparison. Not wrong, just silly.

Linux got this far using informal policy ("please" plus human
oversight) which doesn't scale, if I understand the reasons stated
for being dissatisfied with CVS. Why consider changing to
*anything*, let alone to Aegis or BK, if things are hunky dory?

Aegis DOES NOT do anything to the process which is *informally*
already there, it just makes it formal. And it automates a fair
whack of the turn-the-handle human oversight.

> Great. I have 1000 different branches, with 10,000 different changes.
> I want to include some subset all changes from all branches in each
> branch. How big is your revision history file? About 9,000 times
> bigger than a BK file doing the same thing.

OK, so use the BK algorithm (maybe raw SCCS if it's been there all
the time) to fold them down. Configure Aegis with the right
commands, and this will happen. Aegis does not have an in-built
history tool. Your history store is as optimal as the history tool
you choose; I'm not forcing the use of any particular history tool
or algorithm.

> > The history tool is decoupled from Aegis.
>
> That's nice. It's also completely unreasonable. There are a number
> of operations which have to plow through the metadata to do anything.

Sure, so don't park it in the history files. That's what Aegis'
database is for. (By being able to cope with just about *any*
history tool, Aegis makes few assumptions about what meta-data they
are capable of minding.)

Again, it's a case of letting the history tool do what it's designed
to do, and Aegis being designed to do the other bits.

> ... It's also completely unreasonable.

I think we have exposed an assumption here, Larry. Back up a
second: ask youself if it can't work ever, or it can't within the
BK architecture. I think Aegis works pretty well, and so do my
users. By doing the impossible? No.

> What happens when [some random file] gets corrupted?

Same as always. Fix the bug and get out the backups.
What other possible choices are there?

> It is a _fact_ that disks go bad and that file systems can corrupt
> files. It is a _fact_ that billons of dollars of IP are put into
> [some random file]

So all apps everywhere should be coded to check something that the
operating system is supposed to do already? And just what do they
do when they find a corruption? Get out the backups.

> Bully for you but hell would freeze over before I'd let you near my source.

The fact is that the vast majority of source code, for decades now,
is stored on those very same flakey disks, with *no* application
verification of disk drive integrity. And yet, somehow, we continue
to churn out software.

Aegis is designed to do *one* thing, and do it as well as it can.
Problems outside its domain (file system, build tool, history tool,
etc) it delegates. You want a history tool with a checksum? Go
ahead. Um, what about your compiler?

> If I have a changeset which has
> N bytes of data changed, where N is measured as "diff -n | wc -c"
> of the changeset, how much data does aegis send across the wire to
> propogate that change?

Depends how you send it, tho it always sends the meta data, too.
It could be email, using gzip -9 | mpack, it could be ftp without
the mpack. It depends.

It also depends on the granularity you chose. Are you sending a
single change set, a whole branch (LOD), or a whole project? They are
all sent using the same mechanism. Of course, each is progressively
larger.

> And how much data was sent across the wire to figure out what to
> send?

Zero.

> And while we're at it, suppose I have 2 repositories with 100,000
> files in them. Suppose that they are uptodate. How long does it
> take Aegis to figure that out?

Under what circumstances? If they are both up-to-date, there would
be no reason to do anything. However, if a change set were
propagated, you have to decide. If it is a simple change set, the
files by definition are different (or maybe not, but it's a small
effort to decde). If it is a whole branch, then the files are
checked for equality, at the receiving end, and the change set
pared down to just the differences. Ditto for a whole project (a
project is simply an un-numbered super-ancestor branch) but obviously
the amount of work is larger.

> > 2. No UNCONDITIONAL commits. Ever.
>
> Jeeze. BK is quite happy to stick whatever you ask it into a
> repository. The reason that it is happy to do so is because it
> can undo that change with a single command.

AFTER the trojan has been commited and acted, which is how I worked
out I need to yank it. Locking the barn door after the horse has
bolted. ``Jeeze.''

> > No way any incoming patch from IDontKnowYou is going into my
> > repository without any preconditions! (Really simple preconditions
> > like "the repository still builds" and "the repository still
> > tests out OK")
>
> And what, pray tell, is wrong with optimizing for the common case, which
> is the changeset is just fine? It seems to me that you are making your
> users do extra work that they don't need to do.

Nothing, except security (you know: "availability, integrity, confidentiality").

Nothing, except the common case is "peer review failed".
(If your reviews *don't* fail, why the heck are you doing them?)
(You are doing peer reviews, aren't you?)

Also, one of the stated problems with the current state of L-K affairs
is an increasing proportion of, shall we say, "inferior" incoming
patches. Weeding them out is the issue - policy again - not file revs.

Actually, Aegis distributed development makes *less* work. It is
perfectly possible, and indeed the recommended practice, to write
a procmail filter which will AUTOMATICALLY perform all of these
validations, and leave it ready for a human to give it the once-over.
Once that's done, the rest of the commit can also be AUTOMATED.

Using Aegis, the essential work of the maintainer, to quote
Linus, is to say "No". All the messy bits (build it, test it, push
the files around) are automated.

<asside>
Before y'all scream about testing kernels, let me just say that
(a) the kernel presents a challenge, (b) I don't have a foolfproof
ways to do this, (c) I *do* use Aegis to develop and test kernel
modules, and (d) yes, there will always be the halting problem.
When I say "testing" and "Aegis" in the same sentence, I'm speaking
generally. Mind you, I suspect that as more and more of the kernel
is modules, a heck of a lot of testing is possible!
</asside>

> and when you are going to integrate bug tracking in,

I'm not. The notification hooks are already there. (Change sets
have always had meta-data, such as descriptions and audit trails,
since the very beginning. For many small projects, this may be
enough, but it's a side-effect, not a design goal. Use the state
transition notifications if you want more.) Aegis does a particular
portion of the process, not all of it.

> mailing list

They already work. Aegis does a particular...

> and web server support, etc.

A CGI glue script is provided. Aegis does...
Are you suggesting I add a *web server* to it?!?

Regards
Peter Miller E-Mail: millerp@canb.auug.org.au
/\/\* WWW: http://www.canb.auug.org.au/~millerp/
Disclaimer: The opinions expressed here are personal and do not necessarily
reflect the opinion of my employer or the opinions of my colleagues.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/