Re: 3.0 wishlist Was: Overview of 2.2.x goals?

Kamran (kamran@wallybox.cei.net)
Sat, 24 Jan 1998 16:16:28 -0600 (CST)


Hi,

Larry McVoy wrote:
>: Using message passing on SMP machines with _real_ shared memory is not
>: very clever.
>
>Want to bet? Using the /unmodified/ message passing libraries is stupid.
>But what good vendors do is to provide the same interfaces, in a DLL, that
>are optimized for SMP. So you do things like:
>
> msg_send(from, to, ...)
> {
>         if (SMP(from, to)) {
>                 bcopy(get_addr(from), get_addr(to), length);
>         } else {
>                 real_msg_send(from, to, ....);
>         }
> }
>
>Then it goes like blazes on an SMP while maintaining compatibility with
>clusters. In fact, you can mix and match a cluster of SMPs just fine.

Nice method, but DIPC takes another approach: it uses a simpler interface,
and takes full advantage of a multiprocessor's real shared memory. DIPC
resorts to implicit message passing only when there is a _real_ need for it,
that is, when multiple computers on a network are involved.
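
As a rough sketch of what this looks like to the programmer (error checking
omitted, the key value is arbitrary; see DIPC's documentation for the exact
details), a DIPC program just uses the familiar System V shared memory calls:

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <string.h>

    int main(void)
    {
            /* IPC_DIPC (provided by DIPC's patched headers) marks the */
            /* segment as distributed; drop it and this is an ordinary */
            /* local shared memory program.                            */
            int id = shmget(75, 4096, IPC_CREAT | IPC_DIPC | 0666);
            char *mem = (char *) shmat(id, 0, 0);

            strcpy(mem, "hello");   /* other processes, possibly on other */
                                    /* machines, simply read this address */
            shmdt(mem);
            return 0;
    }

On a single machine the kernel backs this with real shared memory; only when
the cooperating processes live on different computers does DIPC move data
over the network.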

>The main reason for their prevalence is
>
> . works everywhere

Yes. There has been a great need for distributed programming tools. Once
such a system becomes the standard on an influential platform, it spreads
everywhere. One important factor here is that these are user-space programs,
so porting them isn't difficult.

> . same programming model no matter what your environment

I believe they have taken a worst-case approach by adopting a message
passing programming interface everywhere, even where real shared memory is
available.

> . heterogeneous system

The programmer already has to do a lot of work to write a message passing
application. Doing a bit of data conversion on top of that is a small
service.

> . ease of use

You must be an old hand at message passing systems. Message passing is not
what most people feel comfortable with. People learn to use shared memory
from the very beginning: global variables and stack variables are the means
of communication between the different parts of an ordinary program, and
they are shared memory. Moving to multiple threads or processes is just a
step away. Message passing can only be easier for people who are already
used to it.
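
As a trivial illustration (names made up, error handling omitted), here are
two threads communicating through an ordinary global variable, which is
exactly the model programmers already use inside a single program:

    #include <pthread.h>
    #include <stdio.h>

    int counter = 0;                /* shared state: just a global variable */
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *worker(void *arg)
    {
            pthread_mutex_lock(&lock);
            counter++;              /* "sending" data is a plain write */
            pthread_mutex_unlock(&lock);
            return NULL;
    }

    int main(void)
    {
            pthread_t t;

            pthread_create(&t, NULL, worker, NULL);
            pthread_join(t, NULL);
            printf("counter = %d\n", counter);  /* read what the thread wrote */
            return 0;
    }

There is no marshalling, no explicit send or receive, and no separate type
description; the data is simply there.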

>Sure it could but you were talking about is big apps. I have a fair bit of
>experience with those apps and I can tell you that the war is way past over.
>MPI won. For good reason.

MS-DOS was the prevalent OS for many years without being easier or more
powerful than its alternatives. Would you argue that MS-DOS won a battle
against other operating systems? Message passing systems themselves (not
the applications written with them) are easier to develop and maintain, and
they deliver good performance. People really needed distributed
applications, even if writing them meant a lot of work, so now there are
plenty of applications built on message passing. That popularity does not
justify the model in any way.

>: *) Distributed shared memory is much easier to use.
>
>True for trivial apps. False for real apps. Shared memory (distributed
>or otherwise) is a difficult programming model for people to grasp.
>Quick - how many people realize that coherent shared memory isn't -
>all the state you care about is in the registers of the CPU. So when do
>those get flushed? Is your model store ordered, partially store order?
>Etc., etc. Try and learn from the various SMP kernel experiences - it's
>really hard and takes quite talented people to get it right. Do you really
>want to design a programming model that most people can't use? Seems quite
>elitist to me.

Of course not. DIPC programmers don't have to deal with _any_ of that.

Actually, it is using message passing systems that looks like an elitist
approach to me.

>: *) Programs using DIPC can be run in a single computer, even on Linux
>: kernels without DIPC support! There is no need to modify and compile
>: the sources to achieve this.
>
>Ditto for MPI.

Can you run an MPI application without first installing MPI?

>: *) DIPC programs can automatically use SMP hardware and real shared memory.
>: Again, no need for the modification and recompilation of the sources.
>
>Ditto for MPI, it's been this way for years.

Not an attractive solution to me. Refer to my answer about "same
programming model..." above.

>: *) DIPC can work in heterogeneous environments (currently x86 and M68k).
>: This is not very common among distributed shared memory systems.
>
>So if I put an int in shared memory on a PC and read it on a SPARC it gets
>byte swapped? I don't think so. You have to have an RPC like XDR layer to
>do this and the kernel sure as hell is not going to do this for you. MPI
>has this, of course.

No, in DIPC a heterogeneous application has to convert the data itself. In
MPI you have to explicitly send the data to a machine, and this has to be
done at the right time, when the data are needed. The programmer also has to
inform MPI of the type of the data being transferred. After expecting the
programmer to do all that, having MPI convert the data should only be
considered a small service.
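
For instance, here is a minimal sketch (ranks and tag arbitrary, error
handling omitted) of what sending even a single integer looks like; the
programmer spells out the count, the type and the destination, which is
what lets the library byte-swap for a heterogeneous receiver:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
            int rank, value = 42;
            MPI_Status status;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            if (rank == 0) {
                    /* count, type and destination are all explicit */
                    MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            } else if (rank == 1) {
                    MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                             &status);
                    printf("received %d\n", value);
            }

            MPI_Finalize();
            return 0;
    }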

>: *) You don't have to learn some new programming interfaces.
>
>Yes you do - you have to learn shared memory. Which is much, much harder for
>people to grasp then you might think.

I strongly disagree with this. Refer to my answer about "ease of use" above.

>Look, don't get me wrong - I think DIPC is cute. But you shouldn't
>expect the big apps people to get at all interested. It only works on
>Linux, is not supported by SGI/SUN/HP/DEC/etc, and as such is completely
>uninteresting to anyone whose /job/ it is to do the sort of programming
>that you are discussing. People running those big apps are running on
>supported machines. Linux isn't commercially supported.

Yes, DIPC has not been ported to other operating systems, and maybe it never
will be. But pointing this out does not make message passing systems any
better. There are tons of COBOL code out there, but that won't convince many
people that it was a good language in the first place (except maybe some
people with years of COBOL experience). COBOL solved many problems, but it
has now become a maintenance burden for many organizations.

Now, COBOL was in the mainstream of computing. Many people needed such
programming languages, and a lot of energy went into improving things.
Distributed programming, on the other hand, has been the domain of
specialists (mostly scientists). Some of them seem content with what they
have, and there seems to be much less overall effort to improve things
there, at least in radical ways.

Maybe Linux is not commercially supported, but I believe it to be very
influential in the computing world.

>That's not to say that this isn't a fun project. But it does throw into
>question the real need for this stuff to be in the mainstream kernel. If
>it isn't going to be used very much then it just becomes more lines of
>code that can cause the kernel to have bugs. We don't want that, do we?

We don't want any bugs, but we certainly do need new functionality. In my
opinion, Linux should not be just another UNIX clone.

Most MS-DOS users upgraded to Windows 95 or NT only when Microsoft wanted
them to. Linux's user community is much more literate and technology-aware.
As I said before, I believe DIPC can have its (not so low) place in
distributed programming. Actually, Linux is an ideal platform to see whether
alternative distributed programming systems like DIPC can be used by
programmers accustomed to writing ordinary applications, rather than only by
people with years of experience in programming message passing systems.
DIPC's motto is: Distributed Programming For The Masses! :-)

Times change; equipment and priorities change too. I'd humbly advise you
to get a bit more information about DIPC. A look at some of the documents
and example programs will clarify many things. You can get DIPC by anonymous
ftp from wallybox.cei.net/pub/dipc. DIPC's web page is at
http://wallybox.cei.net/dipc.

-Kamran