[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Emergency Broadcast System



>Message-Id: <[email protected]>
>Subject: Emergency Broadcast System
>Date: Fri, 22 Oct 1993 13:36:33 -0600 (MDT)
>From: "Philippe_D_Nave" <[email protected]>


> 1) Mailing list data should be distributed to "N" sites where "N" is a 
>    magic number that minimizes the chance of losing all the copies.
>    (Mathematicians, sharpen your pencils!)

I found the equations I mentioned earlier.  The following is a latex
file giving some.  The trend should be obvious.  You don't need many
nodes, if you can repair broken ones quickly enough.

[This is why Stratus has such a large FedEx bill.  When one of our machines
gets a broken component, it phones home automatically, reports the failure
and a new part is FedEx'ed immediately.  If the report comes in by
midnight, the new part is on the customer's desk for him/her to replace by
9am.  (Our current machines have N=2, K=1).  We prefer high speed mailaing
of parts (from a warehouse at FedEx's central sorting airport) over normal
mail in order to maximize $\mu$.]

------------------------------------------------------------------------------

\documentstyle[12pt]{article}

\begin{document}

\title{MTTF of Various Systems}
\author{Carl M. Ellison \thanks{Stratus Computer Inc., 55 Fairbanks Blvd.,
Marlborough MA 01752. Email address: {\tt [email protected]}.}}
\date{June 18, 1993}

\maketitle

\begin{abstract}
Equations are presented for the Mean Time To Failure (MTTF)
of various systems, as a function of
the number of nodes in the system, N, and the minimum number of nodes
in a working system, K.  Failure of a system is defined as having fewer
than K working nodes.
\end{abstract}

\section{Equations} 

Here are some equations for the Mean Time To Failure (MTTF) of various
systems, as a function of the number of nodes in the system, N, and
the minimum number of nodes in a working system, K.  Failure of a
system is defined as having fewer than K working nodes.  Typically,
K=1 and each node has a complete copy of each database.  However,
sometimes the data can be kept on multple nodes (as in a RAID-5 disk
array) which will tolerate some failures, down to a given threshold.

It is assumed that as soon as a failure occurs, a repair cycle will be
started.  There is then a race to see if the repair can be completed
before enough additional nodes fail to drop the working number below
K.  If that race is lost, it is assumed that information has been lost,
in the worst case, but at least that the service represented by the N nodes
is not available.

These equations do not give system availability (probability that a service
is available) but rather MTTF.  Availability equations can be found in
several good textbooks.  See, for example, Trivedi's text on Statistics.
MTTF is more difficult to compute.

The equations below were each a symbolic solution to a custom Markov
chain, built to model the choice of N and K.


N: number of nodes in a full system

K: number of nodes in a minimally functional system

$\lambda$: rate of failures (e.g., number of node failures per year)

$\mu$: rate of node repair (in the same units as $\lambda$)

Each fraction below is the MTTF of the whole system:  the mean time
until a system drops to only (K-1) working nodes.

N =  3 ;  K =  2

\begin{equation}
\frac{ 5\lambda + \mu }{ 6\lambda^2 }
\end{equation}

N =  4 ;  K =  2

\begin{equation}
\frac{ 26\lambda^2 + 6\lambda\mu + \mu^2 }{ 24\lambda^3 }
\end{equation}

N =  4 ;  K =  3

\begin{equation}
\frac{ 7\lambda + \mu }{ 12\lambda^2 }
\end{equation}

N =  5 ;  K =  2

\begin{equation}
\frac{ 154\lambda^3 + 36\lambda^2\mu + 7\lambda\mu^2 + \mu^3 }{ 120\lambda^4 }
\end{equation}

N =  5 ;  K =  3

\begin{equation}
\frac{ 47\lambda^2 + 8\lambda\mu + \mu^2 }{ 60\lambda^3 }
\end{equation}

N =  5 ;  K =  4

\begin{equation}
\frac{ 9\lambda + \mu }{ 20\lambda^2 }
\end{equation}

N =  6 ;  K =  2

\begin{equation}
\frac{ 1044\lambda^4 + 240\lambda^3\mu + 48\lambda^2\mu^2 + 8\lambda\mu^3 + \mu^4 }{ 720\lambda^5 }
\end{equation}

N =  6 ;  K =  3

\begin{equation}
\frac{ 342\lambda^3 + 60\lambda^2\mu + 9\lambda\mu^2 + \mu^3 }{ 360\lambda^4 }
\end{equation}

N =  6 ;  K =  4

\begin{equation}
\frac{ 74\lambda^2 + 10\lambda\mu + \mu^2 }{ 120\lambda^3 }
\end{equation}

N =  6 ;  K =  5

\begin{equation}
\frac{ 11\lambda + \mu }{ 30\lambda^2 }
\end{equation}

N =  7 ;  K =  2

\begin{equation}
\frac{ 8028\lambda^5 + 1800\lambda^4\mu + 360\lambda^3\mu^2 + 62\lambda^2\mu^3 + 9\lambda\mu^4 + \mu^5 }{ 5040\lambda^6 }
\end{equation}

N =  7 ;  K =  3

\begin{equation}
\frac{ 2754\lambda^4 + 480\lambda^3\mu + 75\lambda^2\mu^2 + 10\lambda\mu^3 + \mu^4 }{ 2520\lambda^5 }
\end{equation}

N =  7 ;  K =  4

\begin{equation}
\frac{ 638\lambda^3 + 90\lambda^2\mu + 11\lambda\mu^2 + \mu^3 }{ 840\lambda^4 }
\end{equation}

N =  7 ;  K =  5

\begin{equation}
\frac{ 107\lambda^2 + 12\lambda\mu + \mu^2 }{ 210\lambda^3 }
\end{equation}

N =  7 ;  K =  6

\begin{equation}
\frac{ 13\lambda + \mu }{ 42\lambda^2 }
\end{equation}

N =  8 ;  K =  2

\begin{equation}
\frac{ 69264\lambda^6 + 15120\lambda^5\mu + 3000\lambda^4\mu^2 + 520\lambda^3\mu^3 + 78\lambda^2\mu^4 + 10\lambda\mu^5
+ \mu^6 }{ 40320\lambda^7 }
\end{equation}

N =  8 ;  K =  3

\begin{equation}
\frac{ 24552\lambda^5 + 4200\lambda^4\mu + 660\lambda^3\mu^2 + 92\lambda^2\mu^3 + 11\lambda\mu^4 + \mu^5 }{ 20160\lambda^6 }
\end{equation}

N =  8 ;  K =  4

\begin{equation}
\frac{ 5944\lambda^4 + 840\lambda^3\mu + 108\lambda^2\mu^2 + 12\lambda\mu^3 + \mu^4 }{ 6720\lambda^5 }
\end{equation}

N =  8 ;  K =  5

\begin{equation}
\frac{ 1066\lambda^3 + 126\lambda^2\mu + 13\lambda\mu^2 + \mu^3 }{ 1680\lambda^4 }
\end{equation}

N =  8 ;  K =  6

\begin{equation}
\frac{ 146\lambda^2 + 14\lambda\mu + \mu^2 }{ 336\lambda^3 }
\end{equation}

N =  8 ;  K =  7

\begin{equation}
\frac{ 15\lambda + \mu }{ 56\lambda^2 }
\end{equation}

N =  9 ;  K =  3

\begin{equation}
\frac{ 241128\lambda^6 + 40320\lambda^5\mu + 6300\lambda^4\mu^2 + 888\lambda^3\mu^3 + 111\lambda^2\mu^4
+ 12\lambda\mu^5 + \mu^6 }{ 181440\lambda^7 }
\end{equation}

N =  9 ;  K =  4

\begin{equation}
\frac{ 60216\lambda^5 + 8400\lambda^4\mu + 1092\lambda^3\mu^2 + 128\lambda^2\mu^3 + 13\lambda\mu^4 + \mu^5 }{ 60480\lambda^6 }
\end{equation}

N =  9 ;  K =  5

\begin{equation}
\frac{ 11274\lambda^4 + 1344\lambda^3\mu + 147\lambda^2\mu^2 + 14\lambda\mu^3 + \mu^4 }{ 15120\lambda^5 }
\end{equation}

N =  9 ;  K =  6

\begin{equation}
\frac{ 1650\lambda^3 + 168\lambda^2\mu + 15\lambda\mu^2 + \mu^3 }{ 3024\lambda^4 }
\end{equation}

N =  9 ;  K =  7

\begin{equation}
\frac{ 191\lambda^2 + 16\lambda\mu + \mu^2 }{ 504\lambda^3 }
\end{equation}

N =  9 ;  K =  8

\begin{equation}
\frac{ 17\lambda + \mu }{ 72\lambda^2 }
\end{equation}

N = 10 ;  K =  4

\begin{equation}
\frac{ 662640\lambda^6 + 90720\lambda^5\mu + 11760\lambda^4\mu^2 + 1400\lambda^3\mu^3 + 150\lambda^2\mu^4
+ 14\lambda\mu^5 + \mu^6 }{ 604800\lambda^7 }
\end{equation}

N = 10 ;  K =  5

\begin{equation}
\frac{ 127860\lambda^5 + 15120\lambda^4\mu + 1680\lambda^3\mu^2 + 170\lambda^2\mu^3 + 15\lambda\mu^4 + \mu^5 }{ 151200\lambda^6 }
\end{equation}

N = 10 ;  K =  6

\begin{equation}
\frac{ 19524\lambda^4 + 2016\lambda^3\mu + 192\lambda^2\mu^2 + 16\lambda\mu^3 + \mu^4 }{ 30240\lambda^5 }
\end{equation}

N = 10 ;  K =  7

\begin{equation}
\frac{ 2414\lambda^3 + 216\lambda^2\mu + 17\lambda\mu^2 + \mu^3 }{ 5040\lambda^4 }
\end{equation}

N = 10 ;  K =  8

\begin{equation}
\frac{ 242\lambda^2 + 18\lambda\mu + \mu^2 }{ 720\lambda^3 }
\end{equation}

N = 10 ;  K =  9

\begin{equation}
\frac{ 19\lambda + \mu }{ 90\lambda^2 }
\end{equation}

\end{document}