Re: Hot pluggable CPUs ( was Linux 2.5 / 2.6 TODO (preliminary) )

From: James Sutherland (jas88@cam.ac.uk)
Date: Sun Jun 04 2000 - 14:53:37 EST


On Sun, 4 Jun 2000, Bruce Guenter wrote:

> On Sat, Jun 03, 2000 at 08:44:37PM +0100, James Sutherland wrote:
> > > So, you've essentially got two complete systems (once you add up all the
> > > components) in a single box.
> >
> > No. I have the same components, but organised to make one single machine
> > with N+N redundancy, rather than a pair of independent machines with no
> > redundancy at all.
> >
> > > What does this buy you above having two completely independent boxes?
> >
> > Redundancy. Your approach gives you two machines, each with, say, 99.99%
> > availability. Mine gives a single machine with, perhaps, 99.9999%. Two
> > machines without redundancy have much lower availability.
>
> I was referring to N machines with network-level redundancy instead of a
> lower-level redundancy (either shared memory or shared bus interconnect).

I know. The point is, there are applications where that just isn't good
enough. You need the one machine to work all the time - switching to
another machine just isn't an option.

For "normal" network tasks, this will usually suffice; HTTP, FTP, SMTP
etc. failover is adequate. If the machine is doing something important,
though, it just isn't possible to failover. Think mission critical control
systems - "hang on reactor, we'll have a spare online in another 30 nanos-
BOOM. Too late."

More to the point, suppose you have a pair of identical systems with no
redundancy: the system HDD fails in one (killing it), then a CPU fails in
the second (killing it). A fully redundant machine is fine with
that: it still has a working system drive, and a working CPU. Your
machines, OTOH, are now both offline.
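
The scenario above can be checked mechanically. What follows is a minimal
sketch, not a model of any real hardware: the component names (`hdd`, `cpu`,
`psu`) and both helper functions are assumptions made up for illustration,
and a non-redundant box is assumed to die as soon as any one of its
components fails.

```python
from itertools import combinations

# Hypothetical component types; each one exists twice in both designs.
COMPONENTS = ["hdd", "cpu", "psu"]

def pair_of_boxes_up(failed):
    """Two independent boxes: up if at least one box has no failed part."""
    return any(
        all((box, c) not in failed for c in COMPONENTS) for box in (0, 1)
    )

def redundant_machine_up(failed):
    """One N+N machine: up if every component type has a working copy."""
    return all(
        any((copy, c) not in failed for copy in (0, 1)) for c in COMPONENTS
    )

# The case from the text: the HDD dies in one box, a CPU in the other.
scenario = {(0, "hdd"), (1, "cpu")}
print(pair_of_boxes_up(scenario), redundant_machine_up(scenario))

# Enumerate every possible double failure and count the fatal ones.
parts = [(i, c) for i in (0, 1) for c in COMPONENTS]
doubles = [set(p) for p in combinations(parts, 2)]
fatal_pair = sum(not pair_of_boxes_up(f) for f in doubles)
fatal_redundant = sum(not redundant_machine_up(f) for f in doubles)
print(fatal_pair, fatal_redundant, len(doubles))
```

With three component types, 9 of the 15 possible double failures take the
pair of boxes offline, against 3 for the redundant machine. Two failures
inside the same box leave the other box running, which is why the first
figure is 9 rather than 15.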

> > > I wouldn't be surprised if a single box with all the redundant
> > > components costs more than the total price of two separate boxes.
> > Yes - you are paying through the nose for the extra 9s of availability.
> > There are markets where the client is more than happy to do so; in mission
> > critical apps, double the price for an extra 9 is a bargain.
>
> Only double the price? That would indeed be a bargain.

Even at greater differentials. However, double the price for N+N
redundancy gives much better availability than two non-redundant machines.
As soon as any single component fails in one node, EVERY component of the
other machine becomes a single point of failure: with my N+N system, two
of the same component have to fail at once to take me offline. With your
system, ANY two components failing (one in each box) will take you offline.
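
To put rough numbers on this, here is a back-of-the-envelope sketch. It
assumes failures are independent, that every component has the same
availability `a`, that a box needs all `k` of its components to run, and
that failover is instantaneous; the values of `a` and `k` are invented for
illustration and the function names are hypothetical.

```python
def single_box(a, k):
    """Availability of one non-redundant box with k components."""
    return a ** k

def two_boxes(a, k):
    """Two non-redundant boxes: down only when both boxes are down."""
    down = 1 - single_box(a, k)
    return 1 - down ** 2

def n_plus_n(a, k):
    """One machine with every component duplicated: a component type
    is lost only when both of its copies are down."""
    return (1 - (1 - a) ** 2) ** k

a, k = 0.999, 10  # illustrative values only
for name, f in [("single box", single_box),
                ("two boxes", two_boxes),
                ("N+N machine", n_plus_n)]:
    print(f"{name:12s} {f(a, k):.6f}")
```

Under these assumptions the N+N machine has roughly `k` times less
downtime than the pair of boxes, because a box is lost to any one of its
`k` components while a duplicated component is lost only when both copies
fail.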

James.
