Re: [2.6 patch] i386: always use 4k stacks

From: Helge Hafting
Date: Mon Dec 19 2005 - 03:51:07 EST


Ray Lee wrote:

(Man, I've been holding my tongue on this conversation for a while,
but it seems my better angels have deserted me.)

On 12/15/05, Lee Revell <rlrevell@xxxxxxxxxxx> wrote:


Bugzilla link please.



No, that's not how failure engineering is done. A guy designing a
bridge doesn't cut all the supports back to the bare minimum just to
save money because his design says that the remaining metal should be
strong enough. If you can't prove it, and it's a safety issue
(continuing my analogy in the physical world), then you engineer for
failure. Can you handle all occurrences? No, a hurricane Katrina comes
along every once in a while. Can you weather more than you did before?
Yes. In the meantime, there are fewer poor sods falling off the bridge
who have to open a bugzilla report.


This is quite different - you know much better what stack loads
the kernel may get into. A bridge gets all sorts of weather,
with the very extreme cases occurring now and then. With the
kernel, you can look at the code and determine the maximum
possible stack depth that can ever occur. It won't get deeper
even in some very rare case.
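
To make that concrete, here is a tiny userspace sketch (not kernel
code, and the function names are invented): every function's frame
size is fixed at compile time, so the worst case is simply the sum
over the deepest call chain, and tools like scripts/checkstack.pl
report per-function frame sizes straight from the object code.

#include <stdio.h>
#include <string.h>

static int leaf_handler(void)
{
	char buf[1024];		/* large fixed-size on-stack buffer -
				 * exactly the kind of frame a stack
				 * checker flags */
	memset(buf, 0, sizeof(buf));
	return buf[0];
}

static int mid_layer(void)
{
	char hdr[256];		/* smaller fixed-size frame */
	memset(hdr, 0, sizeof(hdr));
	return leaf_handler() + hdr[0];
}

int main(void)
{
	/* Worst-case stack use here is bounded by the frames of
	 * main + mid_layer + leaf_handler - a number you can read
	 * off the code, unlike the load on a bridge. */
	return mid_layer();
}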


The world of software is no different. If someone wants to remove the
8k stacks option, they'd better prove that they're making my servers
more reliable. I've seen zero arguments for why 8k stacks is unviable.


Well, would you like the kernel to kill your webserver (or whatever
important app) because it attempted to fork at a time when
no two consecutive free pages could be found? That happens
occasionally in real life - with 8k stacks. Going to 12k, 16k,
or (shudder) 64k stacks would make it much worse than it
is today.
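
For reference, the stack size maps directly onto the allocation order
the kernel has to request from the buddy allocator: 4k is a single
page (order 0), 8k is two physically consecutive pages (order 1),
12k or 16k is order 2, and so on. A trivial userspace sketch of that
arithmetic, assuming the i386 page size of 4096 bytes:

#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Smallest buddy-allocator order whose block covers `size` bytes:
 * 4k -> 0, 8k -> 1, 12k/16k -> 2, 64k -> 4. */
static int stack_order(unsigned long size)
{
	unsigned long pages = (size + PAGE_SIZE - 1) / PAGE_SIZE;
	int order = 0;

	while ((1UL << order) < pages)
		order++;
	return order;
}

int main(void)
{
	unsigned long sizes[] = { 4096, 8192, 12288, 16384, 65536 };
	int i;

	for (i = 0; i < 5; i++)
		printf("%6lu-byte stack needs an order-%d allocation "
		       "(%lu consecutive pages)\n",
		       sizes[i], stack_order(sizes[i]),
		       1UL << stack_order(sizes[i]));
	return 0;
}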

(I've also wondered why we can't just have IRQ stacks plus 8k thread
stacks -- seemingly the best of both worlds) Instead, what I've seen
is that we have coders who don't like the idea of any non-order-zero
allocations taking place, because big systems running poorly coded
Java apps with massive threading can hit problems with allocations
from time to time.


You don't need big threaded apps for this to happen. All
you need is a handful of forking apps and memory
fragmentation. Many common server apps (web, mail, fileserver, ...)
tend to fork. Massively threaded apps like 4k stacks too, simply
because that saves 4k of kernel memory for each of the many threads.
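
To see why fragmentation hits order-1 so much harder than order-0,
here is a toy userspace model - not kernel code, and the 50% free
figure is just an arbitrary assumption: half the pages are free but
scattered, and an 8k stack needs two consecutive free pages aligned
the way the buddy allocator hands out order-1 blocks.

#include <stdio.h>
#include <stdlib.h>

#define NPAGES 4096

int main(void)
{
	char page_free[NPAGES];
	int i, nfree = 0, npairs = 0;

	srand(1);
	/* Scatter free pages at random: roughly half of memory is
	 * free, so the box is nowhere near out of memory overall. */
	for (i = 0; i < NPAGES; i++) {
		page_free[i] = rand() & 1;
		nfree += page_free[i];
	}

	/* An order-0 (4k) stack needs any single free page; an
	 * order-1 (8k) stack needs two consecutive free pages at an
	 * even index, which is how buddy blocks are aligned. */
	for (i = 0; i + 1 < NPAGES; i += 2)
		if (page_free[i] && page_free[i + 1])
			npairs++;

	printf("free pages (possible 4k stacks):         %d\n", nfree);
	printf("aligned free pairs (possible 8k stacks): %d\n", npairs);
	return 0;
}

With about 2000 of 4096 pages free, only around 500 aligned pairs
typically survive - plenty of room for order-0 stacks, far less for
order-1 ones.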

The answer for that is the same answer the kernel community usually
gives about poorly designed userspace applications: rewrite them.

I'm quite open to being proved wrong. If someone has a counter case
they can toss forth, please do so. Systems taking lots of interrupts?
Then how about 8k + IRQ stacks? With a counterexample I'll gladly
concede that I'm an ignorant slut[*] -- excuse me, Saturday Night Live
flashbacks -- an ignorant git, and shut up. ([*] is only half right,
I'm not all that ignorant).

If someone doesn't show a counter case, then may I suggest people
consider the possibility that this is not proper engineering. Prove
it, or provide a safety blanket. But don't yank the blanket without
proving the lack of problem.


Well, failing order-1 allocations is _the_ counterexample. It
never happens with 4k stacks unless you run totally out of
memory, and then nothing can save you.

Helge Hafting