Thanks, Flavio, I appreciate your feedback.
Three power sources would obviously solve the problem. Unfortunately, that does not seem feasible at the moment (we would need to rebuild the whole existing infrastructure). This is the main reason I am exploring possible alternatives (besides the fact that ZK otherwise fits our needs ideally). EC2 is also theoretically possible but, as you noted, a very shaky solution at best.

The other possibility I see that might work is to dynamically adjust the quorum rules when a failure is detected. Let's say that if we detect the failure of half of the servers (manually or automatically), we notify the live nodes to adjust the quorum policy by excluding the dead nodes' votes (of course, we need to make sure that the dead nodes really are dead - we can kill the processes). Basically, it means that we need to reconfigure the cluster on the fly. Obviously, it also complicates recovery. Any opinions, input, or ideas on this approach? Please do not think that I am being stubborn in looking for a solution here. The thing I would hate most is to give up on ZK (which is otherwise ideal for us) just because of these limitations.
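
To illustrate the idea of excluding dead nodes' votes, here is a minimal Python sketch. It is not ZooKeeper code: `has_quorum`, the server ids, and the two feed sets are hypothetical names chosen for illustration, and the rule assumed is a plain strict majority over the current voting set.

```python
def has_quorum(acks, voters):
    """Strict-majority rule over the current voting set (illustrative only)."""
    return len(set(acks) & set(voters)) > len(voters) / 2


# Hypothetical layout: six servers, three on each power feed.
feed_a = {1, 2, 3}
feed_b = {4, 5, 6}
voters = feed_a | feed_b

# Feed B loses power: the three survivors are exactly half, not a majority.
print(has_quorum(feed_a, voters))    # False

# Once feed B is confirmed dead, the voting set is shrunk to the survivors.
reconfigured = set(feed_a)
print(has_quorum({1, 2}, reconfigured))    # True

# Danger case: if feed B were in fact alive and applied the same adjustment
# to itself, it would also consider two of its own votes a quorum.
print(has_quorum({4, 5}, feed_b))    # True
```

The last line is why the dead nodes have to be verifiably dead before the policy changes: two halves that each shrink their voting set can both accept writes.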


On 07/15/2010 12:26 PM, Flavio Junqueira wrote:
Your EC2 suggestion sounds reasonable. If your deployment is able to form a local quorum most of the time, then you would be able to get a quorum of acks most of the time.

One concern is that the EC2 replica might lag behind badly, which may force the leader either to slow down or to drop the connection to the EC2 follower, assuming that the EC2 server is not the leader itself.

It might not be a possibility for you, but ideally you could have three power sources and three sets of servers. We could then tolerate the failure of one power source with the mechanisms we have currently implemented.
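
The arithmetic behind the three-feed suggestion can be checked with a short Python sketch. It assumes a plain strict-majority rule and equally sized groups; the function name `majority_survives` and the server layouts are made up for illustration and are not taken from ZooKeeper.

```python
def majority_survives(feeds, failed_feed):
    """True if the servers outside failed_feed are a strict majority of all servers."""
    total = sum(len(servers) for servers in feeds.values())
    alive = total - len(feeds[failed_feed])
    return alive > total / 2


# Hypothetical layouts: six servers split over two or three power feeds.
two_feeds = {"A": [1, 2, 3], "B": [4, 5, 6]}
three_feeds = {"A": [1, 2], "B": [3, 4], "C": [5, 6]}

for name, feeds in (("two feeds", two_feeds), ("three feeds", three_feeds)):
    results = {feed: majority_survives(feeds, feed) for feed in feeds}
    print(name, results)
# two feeds {'A': False, 'B': False}
# three feeds {'A': True, 'B': True, 'C': True}
```

With two equal feeds, losing either one leaves exactly half of the servers, which is never a strict majority. With three equal feeds, any single loss leaves two thirds.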

-Flavio

On Jul 14, 2010, at 11:16 PM, Sergei Babovich wrote:

Thanks, Flavio,
Yep... I see. This is a problem. Any better ideas?
As an alternative, we could probably consider running a single ZK
node on EC2, solely to handle this specific case. Does that make
sense to you? Is it feasible? Would it result in a considerable
performance impact due to network latency? I hope that, at least in
theory, since a quorum can be reached without an ack from the EC2 node,
the performance impact might be manageable.
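
A rough way to reason about the latency question is that a write commits once the leader holds acks from a strict majority, so the commit time is set by the slowest ack the leader still has to wait for. The Python sketch below models only that. The function name `commit_latency_ms`, the ensemble of six local servers plus one EC2 node, and the 1 ms and 80 ms figures are all assumptions for illustration, not measurements.

```python
def commit_latency_ms(follower_ack_ms, ensemble_size):
    """Time until the leader holds acks from a strict majority.

    follower_ack_ms: ack round-trip times of the followers that are up.
    The leader counts as one vote with zero latency.
    Returns None when the live servers cannot form a majority.
    """
    needed = ensemble_size // 2 + 1        # strict majority, leader included
    follower_acks_needed = needed - 1
    if follower_acks_needed > len(follower_ack_ms):
        return None
    if follower_acks_needed == 0:
        return 0
    # The leader waits for the k-th fastest follower ack.
    return sorted(follower_ack_ms)[follower_acks_needed - 1]


# Assumed ensemble: 6 local servers + 1 EC2 node = 7 voters, local leader.
all_up = [1, 1, 1, 1, 1, 80]       # five local followers, one EC2 follower
one_feed_down = [1, 1, 80]         # two local followers left, plus EC2

print(commit_latency_ms(all_up, 7))          # 1
print(commit_latency_ms(one_feed_down, 7))   # 80
print(commit_latency_ms([1, 1], 7))          # None
```

Under these assumptions the EC2 node costs nothing while both feeds are up, and its latency only shows in commits after a feed is lost. The sketch does not model a lagging follower slowing the leader down.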

Regards,
Sergei

On 07/14/2010 04:52 PM, Flavio Junqueira wrote:
Hi Sergei, I'm not sure what the implementation of QuorumVerifier you
have in mind would look like to make your setting work. Even if you
don't have partitions, variation in message delays can cause
inconsistencies in your ZooKeeper cluster. Keep in mind that we make
the assumption that quorums intersect.
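
The intersection assumption can be made concrete with a small Python sketch. It is not an implementation of ZooKeeper's QuorumVerifier interface: the `GROUP` mapping, the rule `majority_or_same_group_half` (one reading of the "premium" rule from the original question), and the helper `disjoint_quorums` are all hypothetical.

```python
from itertools import combinations

# Hypothetical layout: six servers, three on each power feed.
GROUP = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}


def majority(votes):
    """Plain strict majority of all servers."""
    return len(votes) > len(GROUP) / 2


def majority_or_same_group_half(votes):
    """Assumed rule: a majority, or exactly half with all votes from one feed."""
    if majority(votes):
        return True
    same_group = len({GROUP[s] for s in votes}) == 1
    return len(votes) * 2 == len(GROUP) and same_group


def disjoint_quorums(rule):
    """Return every pair of quorums under `rule` that share no server."""
    servers = sorted(GROUP)
    quorums = [set(c) for r in range(1, len(servers) + 1)
               for c in combinations(servers, r) if rule(set(c))]
    return [(a, b) for a, b in combinations(quorums, 2) if not (a & b)]


print(disjoint_quorums(majority))                     # []
print(disjoint_quorums(majority_or_same_group_half))  # [({1, 2, 3}, {4, 5, 6})]
```

Under the plain majority rule every pair of quorums overlaps. Under the assumed premium rule the two feeds form quorums with no server in common, so if message delays keep them from hearing each other in time, each side can commit different updates.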

-Flavio

On Jul 14, 2010, at 9:43 PM, Sergei Babovich wrote:

Hi,
We are currently evaluating the use of ZK in our infrastructure. In our
setup we have a set of servers running on two different power feeds.
If one power feed goes away, so does half of the servers. This makes
it problematic to configure a ZK ensemble that would tolerate such an
outage. Network partitioning is not an issue in our case. The only
solution I have come up with so far is to provide a custom
QuorumVerifier that adds a little premium if all servers in the quorum
set are from the same group. Basically, if we have only half of the
votes but all of them belong to the same group, then we decide that we
have a quorum.
Any ideas or better solutions are much appreciated. Sorry if this has
already been discussed/answered.

Regards,
Sergei
This e-mail message and all attachments transmitted with it may
contain privileged and/or confidential information intended solely
for the use of the addressee(s). If the reader of this message is not
the intended recipient, you are hereby notified that any reading,
dissemination, distribution, copying, forwarding or other use of this
message or its attachments is strictly prohibited. If you have
received this message in error, please notify the sender immediately
and delete this message, all attachments and all copies and backups
thereof.


*flavio*
*junqueira*

research scientist

f...@yahoo-inc.com <mailto:f...@yahoo-inc.com>
direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300 fax (408) 349 3301











