Re: stable? quality assurance?

From: Martin Steigerwald
Date: Sat Sep 04 2010 - 15:33:35 EST



Hi again,

On Saturday 04 September 2010, Willy Tarreau wrote:
> On Sat, Sep 04, 2010 at 06:42:19PM +0200, Martin Steigerwald wrote:
> (...)
>
> > The main idea here is to have a two-staged freeze process and to
> > distribute the "I am only taking bug fixes" work to more people than
> > Linus.
> >
> > For this to work properly, I think at the time of the release of the
> > stable kernel subsystem maintainers and Andrew should branch their
> > trees. For example when 2.6.36 is released:
> >
> > - tree
> >
> > => 2.6.36-stable-tree
> > => tree, where 2.6.37 stuff will be going in
> >
> > Thus when subsystem maintainers take new stuff during the merge
> > window, it will be for the next kernel release already, not for the
> > current one, except for bug fix work. I think, however, that the
> > criteria for bug fix work should not be as strict as for the stable
> > patches Greg collects.
> >
> > Thus it needs to be clear: no new stuff for the next kernel from two
> > weeks before the release of the current stable kernel.
>
> While I respect your beliefs on this matter (they once were mine too),
> I now realized I was wrong for several reasons :
> - most developers want to create. They (generally) test what they
> create, they believe it's flawless because it works for them. No need
> for more testing, go on with new features ; if you refuse to merge
> their new work for some time, they work on their own tree and push you
> more work at once next time.
>
> - developers need real world use cases. That means more testers.
> Developers are bad testers because they don't trigger the unexpected
> use cases. And how do you get good testers ? by motivating end users
> to test your code. Most testers will only test a new kernel to get a
> new feature. If it works for them, no need to push the tests further.
> So that means that the first RCs are the most tested, and that the
> later ones are the least tested. Thus at one point you can't hope to
> get bug reports anymore. When you see an -rc7 or -rc8, you think "hey,
> -rc4 was OK, let's wait for -final and install it".

That fits perfectly well. If the first rcs are well tested, then ideally
all major issues should be resolved by the time rc7 or rc8 is reached. Thus
time can be spent on fixing the remaining major open regressions. I guess
those who reported these regressions are interested in testing a fix.

For me, features have been the number one reason to upgrade kernels as
well. But it is not a yes-or-no decision; it is more a matter of tuning how
much new feature work each stable kernel release should carry, and of
finding a way to pay a little more attention to making a stable kernel
release actually stable.

> - people concerned by stability don't test every release. They test
> when they can, precisely because they can't impact production. So they
> don't contribute bug reports in time. And as the 2.4 maintainer, I'm
> well aware of that, because when I break something, I only know about
> it 3-4 months later.

How does this affect my suggestion above? If, as you say, the first rcs
are tested better and if, as I assume, those who reported regressions have
an interest in testing the fixes, I think this can work out nicely.

Aside from that, I am not sure whether most people step in with rc1 or rc2
already. When I tested rc kernels, which I did a few times, I usually
waited until rc3 or rc4 so I could be somewhat confident that the really
major issues had been fixed already.

> For this reason, I think the release rhythm can't much be changed.

I still object to that for the reasons given above, and because I think
that if something does not work out perfectly, it can still be improved.
But I am interested in your other suggestions as well, because maybe the
issue here is not so much the release process but something else:

> I think that trying to evaluate and publish quality per developer or
> maintainer can have a better effect because everyone in the commit
> chain is responsible. But even doing that is hard because some changes
> touch everything and it's not obvious to say that Mr X or Y has done
> some crap.

And who judges what is crap? Build failures could be tracked
automatically. Perhaps even performance regressions could be, in part, as
the automated tests from Phoronix show. Boot failures and freezes are even
more important, though. But then you are probably not judging the quality
of the developer's work but the difficulty of the area he works on.

Nix pointed out that programming ATI Radeon cards can be quite
challenging. And I do have lots of respect for the Radeon KMS related
work. So I think it would be unfair to point at one of the Radeon KMS
developers and say to him "you did crap" for example.

I think crap does happen and am more concerned about how to handle it when
it does.

> In my opinion, reporting bugs is the most effective way of improving
> quality. If you report 10 bugs in a week on the same driver, there are
> chances that at one point this driver's author will want to take some
> time to audit his code and find other bugs before you next point your
> finger at him/her. As you see, the goal is not just to report bugs to
> get them fixed, but to educate bug authors.

Okay, my contribution then: I report bugs. I reported 4-5 kernel bugs
recently. I reported some before, but only occasionally. I didn't face
that many bugs prior to 2.6.34, which contributed to my admittedly very
subjective impression that kernel quality has declined.

> I can tell you that I am an author of quite a number of bugs in another
> project (haproxy), and I absolutely hate it when a bug is detected in
> production (especially given the product's goal), to the point that the
> code is generally reworked 2, 3, 5, 10 times before being committed. Of
> course it is still not enough to catch all bugs, but since the product
> has got a widely accepted reputation of being rock solid, I think it
> works quite well after all.

Interesting project. At work I am currently implementing a highly
available active/passive load balancer cluster using Corosync, Pacemaker
and the IPVS frontend Ldirectord.

> Last, developers must not betray their users' trust. When they're not
> certain of their code, this must be advertised (this is often the case
> but not always). That helps a lot end users select only reliable
> features and experience more stability.

Well, for me a balance must be struck: a kernel has to work well enough
for me to use it regularly. And currently 2.6.34 up to 2.6.36-rc2 on my
ThinkPad T42 simply do not fulfil that criterion. What annoys me most:
Radeon KMS already works perfectly stably on 2.6.33 for me. So the issue
was not in the initial version of Radeon KMS; it was introduced
afterwards. Thus a supposedly more mature and stable version of it is
less stable for me.

2.6.33-tp42-01231-g11b897c has been good to me so far. I am glad it has
not frozen yet. I had better press send now.

Ciao,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
