Re: [RFC] Kernel version numbering scheme change

From: Alex Howells
Date: Wed Oct 22 2008 - 04:59:09 EST


Hey Valdis,

>> Requirements for me to put a kernel on a given server would be:
>
>> * supports the hardware
> The problem is that "supports" is often a fuzzy jello-like substance you
> try to nail to a tree. You mention the R8169 and e1000 drivers - if they
> bring the device up, but have issues under corner cases, is that "supports"
> or not?

Oh, agreed, this is all very use-case specific. Everything I say below
is based on the specific hardware we use, and I'm treating a kernel as
'stable' if the kernel/hardware combination passes a number of tests.

>> * no security holes [in options I enable]
> Similarly for "no security holes". At *BEST*, you'll get "no *known* *major*
> security holes", unless you feel like auditing the entire source tree. There's
> a whole slew of bugs that we can't even agree if they *are* security bugs or
> just plain bugs - see Linus's rant on the subject a few months back.

Agreed. No *known* *major* security holes is fine here.

>> * works reliably, under load/stress.
> And you win the trifecta - I don't think we've *ever* shipped a Linux kernel
> that worked reliably under the proper "beat on the scheduler/VM corner case"
> load/stress testing. Again, the best you can hope for is "doesn't fall over
> under non-pathological non-corner-case loads when sufficient resources are
> available so the kernel has a fighting chance". Doing 'make -j100' on a
> single Core2 Duo is gonna be painful, no matter what.

Well, the typical tests I referred to above are:

* random size file creation/deletion, lots of files
* memory allocation, and freeing up again
* stressing the CPU a bit with one process, then
  forking 25-50 processes to (trivially) test scheduler
* testing network I/O by rapidly/concurrently fetching
  many small files via HTTP, and a few large ones.
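
To give an idea of the shape of these, here is a minimal sketch of the
file churn and fork tests, assuming Python on a POSIX system. The
function names (churn_files, spin, fork_workers) and the default counts
and sizes are made up for illustration; this is not our actual harness.

```python
import os
import random


def churn_files(directory, count=200, max_size=64 * 1024, seed=0):
    """Create `count` files of random size, then delete them all.

    Returns the total number of bytes written. All parameters are
    illustrative defaults, not values taken from a real test plan.
    """
    rng = random.Random(seed)
    paths = []
    total = 0
    for i in range(count):
        size = rng.randint(0, max_size)
        path = os.path.join(directory, "churn-%d.bin" % i)
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
        total += size
    # Delete in shuffled order so removal is not purely sequential.
    rng.shuffle(paths)
    for path in paths:
        os.remove(path)
    return total


def spin(iterations):
    """Burn some CPU with a trivial arithmetic loop."""
    acc = 0
    for i in range(iterations):
        acc = (acc + i * i) % 1000003
    return acc


def fork_workers(n=25, iterations=200000):
    """Fork `n` children that each spin; return how many exited cleanly."""
    pids = []
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: do the work, then exit immediately so it never
            # falls through into the parent's code path.
            code = 1
            try:
                spin(iterations)
                code = 0
            finally:
                os._exit(code)
        pids.append(pid)
    ok = 0
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
            ok += 1
    return ok
```

Running churn_files() against a scratch directory on the filesystem
under test, then spin() once followed by fork_workers(25) up to
fork_workers(50), roughly mirrors the first and third bullets. A box
that oopses or hangs partway through fails the test.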

The end goal is simply to get a server which doesn't crash under
"normal" operating conditions. The bugs I referred to in
e1000/forcedeth and r8169 either stop the machine from PXE booting (a
requirement for our environment!) or can *easily* be made to oops /
stop working.

Alex
