Re: Inter-Kernel Communications (Multi Kernel Clusters)

Anders Östling (anders.ostling@neurope.ikea.se)
Tue, 25 Feb 1997 12:09:11 +0100


Hi

> There are some funny special cases that need to be considered. How
> does your proposal handle failing components in the cluster? Imagine
> a network failure that splits your cluster in two parts, each fully
> functional but unconnected to the rest. The so-called split-brain
> syndrome. Now each cluster half will continue processing and assume
> it is authoritative? Imagine a database system being split up into
> two systems ...

VMS clusters handle this very nicely by using a voting mechanism where
each member has 0-N votes, and the sum of the votes of the members that
can see each other must be at least the "quorum", which is derived from
the "expected votes" (the total for the whole cluster). If not, each
cluster member will freeze until the quorum has been reestablished. This
algorithm is very simple yet very clever: since the quorum is more than
half of the votes, at most one part of a split cluster can keep running.

Member A (1 vote)
Member B (1 vote)
Member C (1 vote)

= 3 votes

Quorum = (3 (expected votes) / 2) + 1 = 2 (integer division, so 3 / 2 = 1)

This means that if one system leaves the cluster (network partitioned),
then the remaining two nodes still have <Quorum> votes and can continue.
The third node will NOT have enough votes (it does not see any other
nodes), so it will freeze until connectivity is reestablished.
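
The arithmetic above can be sketched in a few lines of Python. This is
only an illustration of the voting rule as described in this mail, not
VMS code; the names quorum() and partition_can_run() are made up for
the example, and every member is assumed to have 1 vote.

```python
# Votes per cluster member, as in the three-node example above.
VOTES = {"A": 1, "B": 1, "C": 1}
EXPECTED_VOTES = sum(VOTES.values())  # 3


def quorum(expected_votes):
    """Smallest vote count that is more than half of the expected votes.

    (expected_votes + 2) // 2 is the same as expected_votes // 2 + 1.
    """
    return (expected_votes + 2) // 2


def partition_can_run(members):
    """True if the members that can still see each other hold a quorum."""
    visible_votes = sum(VOTES[m] for m in members)
    return visible_votes >= quorum(EXPECTED_VOTES)


# Network split: A and B still see each other, C is on its own.
majority_side = ["A", "B"]
minority_side = ["C"]

print(quorum(EXPECTED_VOTES))            # quorum for 3 expected votes
print(partition_can_run(majority_side))  # A+B hold 2 votes
print(partition_can_run(minority_side))  # C holds 1 vote
```

Because the quorum is always more than half of the expected votes, two
disjoint groups of members can never both reach it.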

In a two-node cluster the same scheme can be achieved by using a locally
connected (or shared) disk as a voting member with 1 vote.
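
A rough sketch of why the extra disk vote helps, again in Python. It
assumes each node has 1 vote and that only one node at a time can count
the disk's vote; how the disk is actually arbitrated between the nodes
is not modelled here, and the function names are hypothetical.

```python
def quorum(expected_votes):
    """Smallest vote count that is more than half of the expected votes."""
    return (expected_votes + 2) // 2


def node_can_continue(node_votes, disk_votes, holds_disk, expected_votes):
    """True if a node cut off from its peer still holds a quorum.

    holds_disk says whether this node is the one counting the disk's vote.
    """
    visible_votes = node_votes + (disk_votes if holds_disk else 0)
    return visible_votes >= quorum(expected_votes)


# Two nodes, no voting disk: 2 expected votes, quorum is 2.
# A node on its own has 1 vote, so both nodes freeze after a split.
alone_without_disk = node_can_continue(1, 0, False, expected_votes=2)

# Two nodes plus a disk with 1 vote: 3 expected votes, quorum is 2.
# The node that holds the disk has 1 + 1 = 2 votes and keeps running;
# the other node has 1 vote and freezes.
holder_with_disk = node_can_continue(1, 1, True, expected_votes=3)
other_with_disk = node_can_continue(1, 1, False, expected_votes=3)

print(alone_without_disk, holder_with_disk, other_with_disk)
```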

/Anders