Suppose a machine has soft failure probability p_1 and catastrophic (hard)
failure probability p_2 << p_1. Assume that the two machines fail
independently.
Probability of soft failure of a one machine cluster = p_1. For a two
machine cluster it is the probability of soft failure of 1 or 2 machines
+ the probability of one machine hard failure:

  1-(1-p_1)^2 + 2 p_2 (1-p_1-p_2) \approx 2 p_1

Probability of hard failure of a one machine cluster = p_2. For a two
machine cluster it is the probability of double hard failure = p_2^2
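
To make the arithmetic above concrete, here is a minimal Python sketch that
just evaluates the formulas as written. The values chosen for p1 and p2 are
made-up illustrative numbers, not measurements, and the function names are
only for this example.

```python
# Hypothetical per-machine failure probabilities (illustrative values only).
p1 = 1e-2   # soft failure
p2 = 1e-4   # catastrophic (hard) failure, chosen so that p2 << p1


def one_machine(p1, p2):
    """Return (soft, hard) failure probabilities for a single machine."""
    return p1, p2


def two_machines(p1, p2):
    """Return (soft, hard) failure probabilities for a two machine cluster.

    soft: at least one soft failure, plus one machine suffering a hard failure
    hard: both machines suffer a hard failure (independence assumed)
    """
    soft = 1 - (1 - p1) ** 2 + 2 * p2 * (1 - p1 - p2)
    hard = p2 ** 2
    return soft, hard


soft1, hard1 = one_machine(p1, p2)
soft2, hard2 = two_machines(p1, p2)

print(f"one machine : soft={soft1:.3e}  hard={hard1:.3e}")
print(f"two machines: soft={soft2:.3e}  hard={hard2:.3e}")
print(f"soft ratio (two / one) = {soft2 / soft1:.2f}")
```

With p2 much smaller than p1, the soft ratio printed on the last line comes
out close to 2, which is the \approx 2 p_1 approximation.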
So I should have said that, going from one machine to two, the probability
of soft failures increases significantly (it roughly doubles) and the
probability of hard failures drops dramatically.
Standard measures can be used to decrease the single machine catastrophic
failure rate substantially. The net is a higher failure rate for the two
machine case.
On the other hand, the soft failure rate for a three machine cluster is
roughly p_1^2 and the hard failure rate is about p_2^3. These are
dramatically better than either the one or two machine case.
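
For comparing the one, two and three machine cases side by side, a rough
Python sketch follows. It rests on assumptions that are mine, not stated in
the message: the cluster counts as unavailable whenever fewer than a strict
majority of machines are up, only soft failures are counted toward
unavailability, and data is lost only if every machine fails hard. The
function names and the sample values of p1 and p2 are hypothetical. Under
these assumptions the binomial count for three machines gives about 3 p_1^2,
which is the same order as the p_1^2 figure quoted above.

```python
from math import comb


def p_unavailable(n, p):
    """Probability that fewer than a majority of n machines are up,
    assuming each machine fails independently with probability p."""
    quorum = n // 2 + 1              # strict majority (assumption)
    max_tolerated = n - quorum       # failures the cluster can survive
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(max_tolerated + 1, n + 1)
    )


def p_all_lost(n, p_hard):
    """Probability that all n machines suffer a hard failure."""
    return p_hard ** n


p1 = 1e-2   # hypothetical soft failure probability
p2 = 1e-4   # hypothetical hard failure probability

for n in (1, 2, 3):
    print(f"n={n}: unavailable={p_unavailable(n, p1):.3e}  "
          f"all lost={p_all_lost(n, p2):.3e}")
```

In this model the two machine cluster tolerates zero failures, just like a
single machine, but has twice as many machines that can fail. The three
machine cluster is the first size that tolerates one failure.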
On Wed, Mar 31, 2010 at 1:06 PM, David Rosenstrauch <dar...@darose.net> wrote:
>> Using two machines running ZK will actually decrease your reliability
>> compared to using a single machine. Consider using one machine or three.
> Not meaning to pull the thread off-topic, but I don't understand why this
> should be the case. Can you elaborate?