Todd, I don't think there is a feature right now that allows you to do exactly what you're requesting. However, we have been working on a couple of features that might give you what you want:

1- Hierarchical quorums: this feature allows you to split servers into groups (perhaps mapping groups to computing centers), and to assign weights to servers. I believe we could constrain leader servers to be the ones with weight greater than zero;

2- Observers: you could have one computing center containing an ensemble and observers around the edge just learning committed values.
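
To make the hierarchical-quorum idea a bit more concrete, here is a rough Python sketch. It is not ZooKeeper's code. It assumes a quorum means a weighted majority inside a majority of the groups that have nonzero total weight, and it treats "weight greater than zero" as the leader-eligibility test suggested in item 1, which is only a proposal at this point. The server ids, group names, and function names are made up for illustration.

```python
# Hypothetical layout: servers 1-3 sit in the data center, 4-6 in edge pods.
GROUPS = {"dc": [1, 2, 3], "pod1": [4, 5], "pod2": [6]}
# Edge servers get weight 0, so their votes do not count toward a quorum.
WEIGHTS = {1: 1, 2: 1, 3: 1, 4: 0, 5: 0, 6: 0}


def is_quorum(acks, groups=GROUPS, weights=WEIGHTS):
    """Return True if the set of server ids `acks` forms a hierarchical quorum."""
    # Total weight per group; groups whose total weight is zero are ignored.
    totals = {g: sum(weights[s] for s in members) for g, members in groups.items()}
    voting = {g: t for g, t in totals.items() if t > 0}
    won = 0
    for g, total in voting.items():
        got = sum(weights[s] for s in groups[g] if s in acks)
        if got > total / 2:  # weighted majority inside this group
            won += 1
    return won > len(voting) / 2  # majority of the voting groups


def leader_candidates(weights=WEIGHTS):
    """Servers eligible to lead under the assumed 'weight > 0' rule."""
    return sorted(s for s, w in weights.items() if w > 0)
```

With this layout, two data-center servers are enough for a quorum, while any number of edge servers alone are not, and only servers 1 to 3 would be leader candidates.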

Do any of these help in your case?


On Jul 18, 2009, at 12:16 AM, Todd Greenwood wrote:


I'm configuring a ZooKeeper ensemble that will have servers across a
WAN. However, I'd like to constrain leader elections so that leaders are
elected only from a pool of ZooKeeper servers in a centralized computing
center. In this scenario, ZooKeeper servers at the edge of the WAN can
be members of the ensemble, but not leaders. I anticipate that this will
perform significantly better than forcing traffic from one edge of the
WAN to another, which is what happens when a WAN edge node is elected leader.

[POD (1..N)]* <--> [ DC ]**

* contains multiple zk servers, but no leaders
** contains multiple zk servers, each may be a leader

Are there configuration parameters that will accomplish this, or do I
need to patch ZK?

I'm a bit confused by the 3.2.0 admin documentation in this respect:


   leaderServes
   (Java system property: zookeeper.leaderServes)

   Leader accepts client connections. Default value is "yes". The
leader machine coordinates updates. For higher update throughput at the
slight expense of read throughput the leader can be configured to not
accept clients and focus on coordination. The default to this option is
yes, which means that a leader will accept client connections.

Turning on leader selection is highly recommended when you have more
than three ZooKeeper servers in an ensemble.


Q: The leaderServes property seems to be an optimization that favors
updates at the expense of reads. Fine. But the note below that doesn't
make sense to me... What does it mean to turn on leader selection? Does
this mean to configure leaderServes="yes"? Is serving the same as
selecting? Perhaps there is a typo here...

Q: The next items on this web page mention configuring groups and
assigning a weight to the votes of machines in those groups. If I am
defining an ensemble with groups of machines that I do not want elected
as leaders, would I assign them to a group and set their voting weight
to zero? Is that the expected practice?


   group.x=nnnnn[:nnnnn]
   (No Java system property)

   Enables a hierarchical quorum construction. "x" is a group identifier
and the numbers following the "=" sign correspond to server identifiers.
The right-hand side of the assignment is a colon-separated list of server
identifiers. Note that groups must be disjoint and the union of all
groups must be the ZooKeeper ensemble.

   weight.x=nnnnn
   (No Java system property)

   Used along with "group", it assigns a weight to a server when
forming quorums. Such a value corresponds to the weight of a server when
voting. There are a few parts of ZooKeeper that require voting such as
leader election and the atomic broadcast protocol. By default the weight
of a server is 1. If the configuration defines groups, but not weights,
then a value of 1 will be assigned to all servers.
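
As a way of checking my reading of the two options quoted above, here is a small Python sketch that parses group and weight lines and applies the default weight of 1. It is only an illustration. The exact key syntax (`group.<groupId>=id:id:...` and `weight.<serverId>=n`) is my assumption from the descriptions, and `parse_quorum_config` and `EXAMPLE` are hypothetical names, not part of ZooKeeper.

```python
def parse_quorum_config(text):
    """Parse 'group.X=...' and 'weight.X=...' lines into (groups, weights)."""
    groups, weights = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith("group."):
            # Value is a colon-separated list of server ids.
            groups[key[len("group."):]] = [int(s) for s in value.split(":")]
        elif key.startswith("weight."):
            # Assumed: the suffix is a server id, the value is its weight.
            weights[int(key[len("weight."):])] = int(value)
    servers = [s for members in groups.values() for s in members]
    if len(servers) != len(set(servers)):
        raise ValueError("groups must be disjoint")
    for s in servers:
        weights.setdefault(s, 1)  # default weight is 1 when none is given
    return groups, weights


# Hypothetical ensemble: servers 1-3 in the DC, 4-5 at the edge with weight 0.
EXAMPLE = """
group.1=1:2:3
group.2=4:5
weight.4=0
weight.5=0
"""

groups, weights = parse_quorum_config(EXAMPLE)
```

In this example the data-center servers keep the default weight of 1, while the edge servers are explicitly set to zero, which is the arrangement I am asking about.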

