RE: Topic for discussion: OS Design

From: Marty Fouts (marty@dotcast.com)
Date: Mon Oct 23 2000 - 01:24:14 EST

Next message: Deepak Gupta: "PRINTER DRIVER"
Previous message: Andi Kleen: "Re: 2.4.0test10pre4 lockups"
Maybe in reply to: Nick Piggin: "Topic for discussion: OS Design"
Next in thread: Malcolm Beattie: "Re: Topic for discussion: OS Design"
Reply: Malcolm Beattie: "Re: Topic for discussion: OS Design"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

"microkernel" is an unfortunate term. Once upon a time it had a reasonably
well understood technical meaning and then Cutler claimed that NT had a
'microkernel' design and the FUD set in. In the literature I'm familiar
with (not counting marketing hype), 'microkernel' means two distinct classes
of things, although they are often confused, sometimes in the same paper:

1) An implementation technique, probably pioneered in Accent, certainly
popularized in Mach, currently championed in the GNU Hurd (I think,) and
stemming from a lot of earlier work on capabilities and suchlike, tracing
back to Multics. It consists, mainly, of the idea of dividing the
'services' of an O/S kernel into servers each in a separate address space
communicating via a message-passing RPC mechanism. In this case, 'microkernel'
refers to the nucleus of the system that manages the message passing traffic
and implements the 'virtual machine' layer of the system. (I oversimplify,
but it should do for this discussion.)
2) The general concept of moving service facilities to the other side of the
user/supervisor boundary and limiting the nucleus (that part that runs in
supervisor mode) to a very small set of functionality, usually the bare
minimum necessary to implement the VM and communication.
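
To make the first sense concrete, here is a minimal toy sketch in Python. It
is not taken from Accent, Mach or the Hurd; the names Nucleus, rpc and
make_file_server are made up for illustration, and copying a dict stands in
for crossing an address-space boundary. The only point is that the 'nucleus'
does nothing except register servers and move messages.

```python
class Nucleus:
    """Toy 'microkernel': it only registers servers and routes messages."""

    def __init__(self):
        self.servers = {}    # port name -> handler callable (hypothetical)
        self.delivered = 0   # messages moved across the simulated boundary

    def register(self, port, handler):
        self.servers[port] = handler

    def rpc(self, port, request):
        # A real nucleus would switch address spaces here. This sketch only
        # copies the message, so the server never sees the caller's object.
        self.delivered += 1
        reply = self.servers[port](dict(request))
        self.delivered += 1
        return dict(reply)


def make_file_server(nucleus):
    """A 'service' living outside the nucleus, reachable only by message."""
    files = {"/etc/motd": "hello"}

    def handle(msg):
        if msg["op"] == "read":
            path = msg["path"]
            return {"ok": path in files, "data": files.get(path, "")}
        return {"ok": False, "data": ""}

    nucleus.register("fs", handle)


nucleus = Nucleus()
make_file_server(nucleus)
reply = nucleus.rpc("fs", {"op": "read", "path": "/etc/motd"})
print(reply, nucleus.delivered)
```

One read costs two message deliveries here (request and reply), which is the
traffic the nucleus has to manage in this style of design.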

The problem with 'microkernels', like the much earlier problem with
capability-based architectures, is that there is, in most designs, a
mismatch between hardware architecture and software requirements, most
notably in the cost of making a procedure call that crosses between the
'user-space' services and the microkernel - a penalty that can be doubled if
the services have to make calls upon each other.
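
A rough way to see the doubling is to count address-space switches. The
sketch below is a toy model under one assumption: every RPC costs one switch
into the server and one switch back. SwitchCounter, block_server and
file_server are hypothetical names, and real costs depend on the hardware.

```python
class SwitchCounter:
    """Counts simulated address-space switches; nothing else is modelled."""

    def __init__(self):
        self.switches = 0

    def rpc(self, server, *args):
        self.switches += 1            # switch into the server's address space
        result = server(self, *args)
        self.switches += 1            # switch back to the caller
        return result


def block_server(kernel, block_no):
    return f"block-{block_no}"


def file_server(kernel, path):
    # The file server needs the block server, so it makes a nested RPC.
    return kernel.rpc(block_server, 7)


direct = SwitchCounter()
direct.rpc(block_server, 7)           # client talks to one server

nested = SwitchCounter()
nested.rpc(file_server, "/etc/motd")  # server calls another server

print(direct.switches, nested.switches)
```

Under that assumption the nested request pays twice what the direct one does.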

The problem stems from a misfeature of most computer system architectures,
which have the VM system overlapping the functionality of memory
addressability and memory accessibility in such a way that changes to either
require 'heavyweight' operations on the VM hardware (TLBs, page tables,
etc.). There have been three 'solutions' to this problem:

1) Do the logical separation into services, but don't use separate address
spaces. This keeps the performance but doesn't get any hardware memory
protection advantage. It doesn't seem worth trying to retrofit an existing
kernel into such a model simply for the modularity gain that, if the kernel
source is well partitioned anyway, might not be very large.
2) Do the heavyweight message passing, and have people laugh at your
performance.
3) Work *very* hard to find a compromise - which may be possible, but which
few people have yet accomplished.

I have had the good fortune of working with one architecture (PA-RISC) which
gets the separation of addressability and accessibility 'right' enough to be
able to partition efficiently and use ordinary procedure calls (with some
magic at server boundaries) rather than IPCs. There may be others, but
PA-RISC is the one I am aware of. When I last looked, which was when I was still
working on it, IA-64, the newest architecture from Intel, due out "any day
now" still preserved that design, which we had worked hard to get them to
keep.
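
As a toy illustration of what separating addressability from accessibility
buys, the Python sketch below keeps one translation table for everybody and
checks access rights separately, against a small set of ids held by the
running domain. It is only loosely inspired by the idea and is not a
description of the actual PA-RISC mechanism; Machine, map_page, enter_domain
and the id values are all made up.

```python
class Machine:
    """Toy machine: one global translation table shared by every domain."""

    def __init__(self):
        self.translation = {}    # virtual page -> (physical frame, access id)
        self.active_ids = set()  # ids held by the currently running domain

    def map_page(self, vpage, frame, access_id):
        self.translation[vpage] = (frame, access_id)

    def enter_domain(self, protection_ids):
        # Lightweight: only the set of ids changes. The translation table is
        # left alone, so no page-table or TLB rebuild is simulated.
        self.active_ids = set(protection_ids)

    def load(self, vpage):
        frame, access_id = self.translation[vpage]   # addressability
        if access_id not in self.active_ids:         # accessibility
            raise PermissionError(f"page {vpage} is not accessible")
        return frame


m = Machine()
m.map_page(1, frame=100, access_id=10)   # page owned by a 'file server'
m.map_page(2, frame=200, access_id=20)   # page owned by a 'network server'

m.enter_domain({10})
print(m.load(1))                         # the file server sees its own page
try:
    m.load(2)
except PermissionError as err:
    print("denied:", err)

m.enter_domain({20})                     # cross the server boundary
print(m.load(2))
```

Both pages stay addressable the whole time; crossing from one server to the
other changes only which of them is accessible.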

The PA-RISC architectural approach isn't perfect - there are some limited
resources that we wish we had more of - but we were able to demonstrate very
good 'microkernel' performance *in the second sense of 'microkernel'*. We
did this by taking the Brevix concept of "interfaces" and actually
implementing it in HP/UX, and then running some benchmarks designed to
stress server-to-server communication extensively and were very pleased with
the results.

So, the long answer to your question is:

1) a new O/S designed from the ground up in a 'microkernel'-ish way, like
QNX, which doesn't actually do the memory partitioning, or which carefully
designs the components to minimize communication across protection domains,
can be very efficient, but runs into difficulty when its initial assumptions
are violated. (See, for instance, the history of Chorus.)
2) Given the right hardware, it is possible to partition an O/S so that very
little of it at all is 'nucleus' and the vast majority of it is loadable
modules - and you can use simple loader directives to decide if a module
shares address space with the nucleus or lives in a separate address space.
- I hope that Itanium will end up that way, but doubt that the HP work will
survive the marketing decisions that Intel has had to make.
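
The loader-directive idea in (2) can be sketched as follows. This is a toy
in Python under stated assumptions: PLACEMENT stands in for the directives,
load_module for the loader, and a pickle round trip for whatever happens at
a real protection boundary. None of these names come from HP-UX or Brevix.

```python
import pickle

# Hypothetical loader directives: where each module should live.
PLACEMENT = {"fs": "nucleus", "net": "separate"}


def fs_read(path):
    return f"contents of {path}"


def net_send(data):
    return len(data)


MODULES = {"fs": fs_read, "net": net_send}


def load_module(name, placement=PLACEMENT):
    """Return a callable for the module, bound according to its directive."""
    entry = MODULES[name]
    if placement[name] == "nucleus":
        return entry                      # shared address space: plain call

    def proxy(*args):
        # Separate address space: arguments and result are marshalled, so
        # the caller and the module never share objects.
        message = pickle.dumps(args)
        result = entry(*pickle.loads(message))
        return pickle.loads(pickle.dumps(result))

    return proxy


fs = load_module("fs")
net = load_module("net")
print(fs("/etc/motd"), net("abcd"))
```

Callers use fs and net the same way in both cases; only the directive
decides whether a call is direct or goes through the boundary stub.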

Linux isn't really a good basis for a nucleus/server design: although it is
already pretty well partitioned from a source-code point of view, it is
based on a long history of optimization for all-in-one-address-space
'kernels.'

By the way, even highly decomposed very modular systems aren't as flexible
as the people who first developed them expected them to be, but the reason
behind that is for a whole other discussion.

Marty

-----Original Message-----
From: Dwayne C . Litzenberger [mailto:dlitz@dlitz.net]
Sent: Sunday, October 22, 2000 3:59 PM
To: Peter Waltenberg
Cc: linux-kernel@vger.kernel.org
Subject: Re: Topic for discussion: OS Design

On Mon, Oct 23, 2000 at 08:53:26AM +1000, Peter Waltenberg wrote:
> Use the GNU Hurd, it won't run on most hardware you'd like to use, and it's
> probably slower than Linux, but it's a microkernel.

I'll ignore that.

> I've worked with microkernels, IMHO, they suck :). Good idea, but
> fundamentally flawed. The same things that make them more robust (and they
> are more robust) also kill performance.

Could you elaborate? AFAIK, both Neutrino and exec.library are microkernels,
and they by no means lack performance. Even Windows is a microkernel (sort
of), and it doesn't lack in performance that much.

-- Dwayne C. Litzenberger - dlitz@dlitz.net

- Please always Cc to me when replying to me on the lists.
- Please have the courtesy to respond to any requests or questions I may have.
- See the mail headers for GPG/advertising/homepage information.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


This archive was generated by hypermail 2b29 : Mon Oct 23 2000 - 21:00:20 EST