Re: Standard Development Integration

From: Marco Colombo (marco@esi.it)
Date: Fri Jan 21 2000 - 16:06:10 EST


On Tue, 18 Jan 2000, Peter Samuelson wrote:

> Date: Tue, 18 Jan 2000 11:56:00 -0600 (CST)
> From: Peter Samuelson <peter@cadcamlab.org>
> To: Marco Colombo <marco@esi.it>
> Cc: linux-kernel@vger.rutgers.edu
> Subject: Re: Standard Development Integration
>

[... we agree on what self-contained means]

[me]
> > > > Months later, when 'devel' is out, and they're at 90%, what
> > > > should they do? (remember, we're in the "not-self-contained"
> > > > case, here).

[Peter]
> > > Start pushing for inclusion into the devel release. (I.e. start
> > > sending Linus patches.) This is actually the ideal case.

[me]
> > Really, 2.3.0 was devel only by name. No real changes went in it.
> > If you were at 90%, you needed realworld testing. You could not at
> > the same time change your code to follow early 2.3 changes (by
> > assumption, they broke your code). You had to choose. (You could do
> > both things, if you had infinite developer time, of course).

[Peter]
> You still have to follow the development cycle as best you can.
> Sometimes this is easy, sometimes it is hard. If the subsystem you are
> interested in is mature, interface changes probably won't be
> catastrophic and you won't lose a great deal of your work when you have
> to adapt. If the subsystem you are working with is still young (like
> USB), you have to expect a certain amount of volatility. Notice that
> I'm now starting to allude to development cycles of individual kernel
> subsystems, as opposed to that of the kernel itself, and they're not
> always the same thing.

Agreed. Right now, being the kernel cycle quite long, if you're
out of sync, it can be by 6 months or more. I'm proposing a way to
reduce this (halven it), without shortening the devel cycle of a
single kernel branch.

> > major changes happen at 2.2y+1.quite-early. This means at
> > 2.2y-1.right-after-feature-freeze time. If you're at 50%, you can't
> > make it for 2.2y. 2.2y+1.quite-early breaks your code. You're
> > completely out of sync. The difference here is only in that: 1) you
> > knew it before - you started your project late in 2.2y-1 life. You
> > should have delayed it in order to get in sync with 2.2y+1. This
> > means taking part in the initial discussion of what will change in
> > 2.2y+1. In my proposal, that's 1-2 months. Right now it can be 6+
> > months. (Think you want to start it *now*, 4-6 months or more before
> > the first 2.5).
>
> What you seem to be wanting is a way for a developer to time his
> development cycle to be at the starting-out stages when the kernel is
> at its most volatile, and near-stable by the time the kernel is
> near-stable.

Yes. Or better, a way to offer the right kernel to the right developer.
An almost mature one to developers with an almost mature project.
A young one to developers that are just starting. The latter easily
becomes the former, of course (at later time)...

> One way this breaks down is that the whole kernel doesn't necessarily
> "break" at once. In 2.3, Linus started with the cosmetic wait-queue
> stuff, then moved on to the page cache, which affected filesystems and
> mm, and by the time that got back on its feet, he started doing arch
> merges. Other things came in here too, which I have had too little
> sleep recently to remember clearly. PCMCIA joined the fold. IPC got a
> facelift. Then SCSI request queues got overhauled. Then after they
> had settled down a little, Al Viro shuffled the block dev API around.
> 32-bit UID's came in recently.

Yes, and that's somewhat another mistake. Too many changes on a single
kernel. Being so many, they can't happen all at the same time, of course.
Yet they have to make them, since the next devel is so far away in the
future. Delaying the 32-bit UID's to 2.5 does not make sense, right now.
Following my schema, 2.5'd be much closer (and of course 2.6, too).
Moving such (late) changes to the soon-to-come 2.5 would make sense.
And there'd be less to be tested on 2.3, so good chances of having a
stable 2.4.early. Almost all advantages of a short devel cycle apply
here. Still the devel cycle is about the same in lenght. You just
spread the efforts. Of course this way you've got less man-time per branch.
That's why it takes almost the same to make half of the changes.
It's no magic at all. Just make life a bit easier for developers.

> The pattern here is that not *everything* broke in 2.3.1 or 2.3.2.
> Major subsystems have pretty much taken their turns. So if you're
> developing a USB driver, you didn't get screwed in 2.3.1 or even 2.3.10
> -- the USB rewrite was much more recent.
>
> My point is that you *can't* time your development cycle around when
> you expect "your" subsystem to break horribly. So any attempt to do so
> is wasted effort.

Right now, I agree it's impossible. If you have always two kernels to
choose, it's somewhat easier.

> > 2) since the time between two releases is half of what it is now,
> > there are less 'major changes' in every initial devel release. So
> > it's easier for you to eventually modify your code to fit the
> > changes. Or, you have half the chance your code will be affected by
> > the 2.2y-1 -> 2.2y+1 move, which, in the long run, is the same.
>
> Pick your flamewar. Length of kernel devel cycle is a whole different
> one. Linus *has* started making efforts to shorten it.

I've never said i want to shorten the devel cycle. I'm just claiming
my schema has almost the same advantages. Of course there's a price
to pay, but it's not the one of a much shorter devel cycle.
For each release, call it x.2y, the time between the first x.2y-1 and
the first 2.2y is almost the same it is now. It's just that 2.2y+1
will start much before it does now.

[Peter, i think]
> > > > > In other words, the act of opening a new development branch does
> > > > > not automatically break things.
[me]
> > > > It did. 2.3.x broke a lot of things soon, with x small if not 0.
> > > 2.3.7 was the page cache breakage, which wreaked havoc on mm and
> > > filesystems. Before that we just had wait-queue cleanup, which was
> > > almost sed-script material.
> > So? At the time 2.3.7 was out, you had to choose. Change your code to
> > work on it, or go on testing on 2.2. At 60+%, you could only go on on
> > 2.2. At 10% it really doesn't matter if you have to redisign it.
>
> Not so. At 60% you *can* start tracking development kernels. Even if
> they're fairly volatile. Or you can choose to port when the time

if you're lucky...

> comes. That's been done too. I think you overestimate the difficulty
> of porting your own code from one stable release to another. (Unless
> you are working in a highly volatile subsystem like USB, that is.)

I'm just considering the worst case. Of course this means overestimating
on the general case. I DO think that most of the kernel is modular,
and so things can be easily ported to later releases...
but there are things that can't be ported, and there we will fail.

> > What was the state of ReiserFS, RAID 0.90 at 2.3.7 time? (I really
> > don't know. I'm just wondering if they were too advanced to follow
> > the 2.3 changes...)
>
> Not sure. ReiserFS was mostly release-ready, but the reiser people
> apparently didn't follow kernel development too closely and were taken
> somewhat by surprise around 2.3.25 or so when Linus announced the
> intent to freeze soon. Hans practically accused the ext2 people of
> changing the page cache and ext2 simultaneously just before feature
> freeze, effectively shutting out reiserfs from getting into Linux 2.4.
> (There was no conspiracy, of course. Linus set him straight.)

You see. Suppose it happened 6 months before. ReiserFS people,
being far from release-ready, could handle that much better. And by
the time the page cache and ext2 changes were ready, 2.5 could be there.
No need to change the 2.3 the ReiserFS people were working on.
I don't know what the (time) distance is between 2.3.25 and 2.2.5.
Maybe it's close enough to 6 months to fit my schema well.

> RAID 0.90 was hopelessly broken by the page cache stuff, but that's
> hardly a fair case study since Ingo Molnar (RAID maintainer) is the
> same guy who *wrote* most of the page cache stuff. He just didn't have
> enough time to update RAID until recently.

Agreed. Ingo is not more scalable than Linus is.

>
> Peter
>

.TM.

-- 
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@ESI.it

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jan 23 2000 - 21:00:26 EST