Re: Lets get this right (WAS RE:MOSIX and kernel mods)

Jeff Millar (jeff@wa1hco.mv.com)
Sat, 06 Mar 1999 17:57:51 -0500


At 10:26 AM 3/6/99 +0000, you wrote:
>We need to all put the guns down and think a bit.

Agreed on that one.

>
>Moving to a distributed approach to computing is a fundamental change,
>and if we make the wrong design and architecture decisions now, we
>will be stuck with them for a while. If commercial linux apps come to
>rely on a distributed API we provide, we had better be sure that it
>doesn't suck, because as the user- and software base for linux gets
>bigger and bigger, these things get harder and harder and more and
>more painful to put right. Thinking "Linus will make the right
>decision" is a cop out. We need to think this through ourselves.

Not so fundamental. We already have mmap'd files, which in many ways
mirror the effects of DSM: access time depends on usage patterns.
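
To make the mmap analogy concrete, here is a minimal sketch. It assumes
Python's standard mmap module as a stand-in for the kernel interface, and
the helper name mmap_roundtrip is made up for illustration. The store looks
like an ordinary memory write, but what it costs depends on whether the
page is resident or has to come from disk.

```python
import mmap
import os
import tempfile


def mmap_roundtrip(payload: bytes) -> bytes:
    """Store payload through a memory mapping, then read it back via the file.

    Hypothetical helper for illustration; payload must be non-empty.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, len(payload))      # a mapping needs a non-empty file
        with mmap.mmap(fd, len(payload)) as m:
            m[:] = payload                  # looks like a plain memory store
            m.flush()                       # push dirty pages back to the file
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))    # same bytes, now via read()
    finally:
        os.close(fd)
        os.unlink(path)


print(mmap_roundtrip(b"hello DSM"))
```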

>
>To summarise, two lines of argument have emerged so far in this
>debate:
>
>1)"DSM sucks because it encourages bad code. It shouldn't go in"
>
>2)"DSM is an attractive abstraction because it extends stuff we use
>already on a single node to work on the network. It can suck, but not
>too badly. It should go in."
>
>For what it's worth, I am firmly in the former camp. Linux as an OS is
>all about excellence. It doesn't have to cater to commercial
>considerations about time(or cost)-to-market because of the way in
>which development is done. It doesn't have to provide a second-best
>solution because that's all "the development team" has been able to
>come up with. It doesn't have to put something into the kernel just
>because it makes life easier for some people in the short term, and we
>all know that we can't afford to accept sloppiness just because people
>say "Well, it's a compile option, you can just leave it out."
>
>We need a solution that we can live with for the next ten years, not
>one that works now, and then we have to ditch at the next major
>release with much gnashing of teeth.

Within the next year, we'll have networked computer chips that
use 1-4 Gbit per second serial links between them. Imagine
a SIMM/DIMM kind of thing holding 8 CPU chips each with 64 MB on die
...plug in as many as you like on your motherboard. The interconnect
protocol uses DSM so it looks like an SMP.

Crank that projection out 10 years and you'll have multiple processors
interconnected with DSM on a single chip. The complexity limits of
superscalar/multiscalar have already been reached. Designers can
no longer propagate a pipeline interlock signal across the chip in a
single clock...we will go SMP on a single chip. (prediction: Look at
Transmeta with this in mind when they make their announcement).

DSM "sucks" because it has very long latency for remote access compared
to local memory. But that's just part of the continuum of L1 cache, L2
cache, disk, and network latency. We have well-known protocols and
optimizations for each. In 10 years, when CPUs clock at 10 GHz and buses
still clock at about 100-500 MHz because of physics, more and more layers
of the interconnect will look more like networking and less like
connected memory.
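
The clock-ratio arithmetic behind that last point can be sketched as
follows. The function name is hypothetical, and the 10 GHz and 100-500 MHz
figures are only the projections above, not measurements.

```python
def cpu_cycles_per_bus_cycle(cpu_hz: int, bus_hz: int) -> float:
    """Number of CPU clock cycles that elapse during one bus clock cycle."""
    return cpu_hz / bus_hz


CPU_HZ = 10_000_000_000  # 10 GHz: projected CPU clock (assumption from the text)

for bus_hz in (100_000_000, 500_000_000):  # 100 MHz and 500 MHz buses
    ratio = cpu_cycles_per_bus_cycle(CPU_HZ, bus_hz)
    print(f"{bus_hz // 1_000_000} MHz bus: {ratio:.0f} CPU cycles per bus cycle")
```

Under those assumed clocks, a single bus cycle costs the CPU somewhere
between 20 and 100 of its own cycles, before any protocol overhead.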
