On Jul 6, 2009, at 15:40, Henry Robinson wrote:
This is an interesting way of doing things. It seems like there is a correctness issue: if a majority of servers fail, with the remaining minority lagging the leader for some reason, won't the ensemble's current state be forever lost? This is akin to a majority of servers failing and never recovering. ZK relies on the eventual liveness of a majority of its servers; with EC2 it seems possible that that property might not be satisfied.


I think you are absolutely correct. However, my understanding of EC2 failure modes is that even though there is no guarantee that a particular instance's disk will survive a failure, it is quite possible to observe EC2 nodes that "fail" temporarily (e.g., by rebooting). In these cases the instance's disk typically does survive, and when the node comes back it will have the same contents. It is only with "permanent" EC2 failures (e.g., a hardware failure, or Amazon pulling the instance for some other reason) that the disk is gone.

Thus, to me this looks a lot like running your own machines in your own data center: soft failures will recover, hardware failures won't. The one difference is that if you were running the machines yourself and hit some weird issue where hardware failures took out a majority of your ZooKeeper ensemble, you could physically move the disks to recover the state. If this happens in EC2, you will have to do some sort of "manual" repair, forcibly restarting ZooKeeper from the state of one of the surviving members. Some ZooKeeper operations may be lost in this case.
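For concreteness, here is a minimal sketch of what that forced restart could look like, assuming you have already copied the surviving member's dataDir (snapshot plus transaction logs) onto each replacement host. All paths, hostnames, and server IDs below are hypothetical; this is just one way to do it, not an official procedure:

```python
#!/usr/bin/env python
"""Sketch of the "manual" repair described above: seed a fresh ZooKeeper
ensemble from the data directory of a single surviving member. Paths and
hostnames are made up for illustration."""

import os
import shutil

# Assumption: the survivor's dataDir was already copied (scp/rsync) here.
SURVIVOR_DATA = "/tmp/survivor-datadir"
DATA_DIR = "/var/zookeeper/data"      # dataDir for the new ensemble member
CONF_PATH = "/etc/zookeeper/zoo.cfg"

# New ensemble membership (hypothetical hosts); 2888/3888 are the usual
# quorum and leader-election ports.
ENSEMBLE = {1: "zk1.example.com", 2: "zk2.example.com", 3: "zk3.example.com"}
MY_ID = 1  # set differently on each host

def seed_data_dir():
    """Replace the local dataDir with the survivor's snapshot and logs."""
    if os.path.exists(DATA_DIR):
        shutil.rmtree(DATA_DIR)
    shutil.copytree(SURVIVOR_DATA, DATA_DIR)
    # Each member identifies itself via the 'myid' file in its dataDir.
    with open(os.path.join(DATA_DIR, "myid"), "w") as f:
        f.write("%d\n" % MY_ID)

def write_config():
    """Write a zoo.cfg listing only the members of the new ensemble."""
    lines = [
        "tickTime=2000",
        "initLimit=10",
        "syncLimit=5",
        "dataDir=%s" % DATA_DIR,
        "clientPort=2181",
    ]
    for sid, host in sorted(ENSEMBLE.items()):
        lines.append("server.%d=%s:2888:3888" % (sid, host))
    with open(CONF_PATH, "w") as f:
        f.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    seed_data_dir()
    write_config()
    # Then start ZooKeeper normally on every member; the seeded state
    # becomes the ensemble's new initial state. Any writes the survivor
    # never saw are lost, as noted above.
```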

However, we are talking about a situation that seems exceedingly rare. No matter what kind of system you are running, serious non-recoverable failures will happen, so I don't see this as an impediment to running ZooKeeper or other quorum systems in EC2.

That said, I haven't run enough EC2 instances for a long enough period of time to observe any serious failures or recoveries. If anyone has more detailed information, I would love to hear about it.

Evan Jones

--
Evan Jones
http://evanjones.ca/
