Michael Bauland wrote:
- When I connect with a client to the Zookeeper ensemble I provide the
three IP addresses of my three Zookeeper servers. Does the client then
choose one of them arbitrarily or will it always try to connect to the
first one first? I'm asking since I would like to have my clients first
try to connect to the local Zookeeper server and only if that fails (for
whatever reason, maybe it's down) it should try to connect to one of the
servers on a different physical location.
It explicitly randomizes the list. This keeps all clients from
connecting to the same servers in the same order; instead it distributes
the connections across the servers.
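To illustrate the effect of that shuffling, here is a minimal Python sketch. It is not the actual client code: the function name shuffled_hosts and the host names are made up for the example. It only shows how a comma-separated connect string can be split and randomized so that different clients try the servers in different orders.

```python
import random


def shuffled_hosts(connect_string, rng=None):
    """Split a comma-separated connect string and return the hosts in random order.

    `rng` is optional; pass a seeded random.Random for repeatable results.
    """
    rng = rng or random.Random()
    # Drop empty entries and surrounding whitespace before shuffling.
    hosts = [h.strip() for h in connect_string.split(",") if h.strip()]
    rng.shuffle(hosts)
    return hosts


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    print(shuffled_hosts("zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"))
```

Each run prints the same three hosts in a possibly different order, which is why listing the local server first in the connect string does not make the client prefer it.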
There's an option to turn this off in the C client, but not in the Java
client. However, this is intended for testing purposes, not for
production use. We could add it to the Java client (create a JIRA if you
like); however, I'm not sure it would solve your problem. Once the local
server is accessible again there'd be nothing to cause the client to
connect to the local server. If the issue is intermittent it could
An option is to have more than one local server. So if you distribute between 3
colos, then use 7 servers (3-2-2) and have the client connect to the 2
local. This handles local failure, however it does not handle
partitioning of the local servers from the remote ensemble members.
However if both local servers are unable to connect to remote servers it
seems likely that the client couldn't either (network partition). Not an
optimal solution either unfortunately.
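The arithmetic behind the 3-2-2 layout can be checked with a short Python sketch. It assumes the usual majority quorum (more than half of the ensemble); the colo names are placeholders.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def survives_colo_loss(layout):
    """For each colo, report whether the remaining servers still form a quorum
    when that whole colo is lost."""
    total = sum(layout.values())
    needed = quorum_size(total)
    return {colo: (total - count) >= needed for colo, count in layout.items()}


if __name__ == "__main__":
    # Placeholder colo names; 7 servers spread 3-2-2.
    layout = {"colo_a": 3, "colo_b": 2, "colo_c": 2}
    print(quorum_size(sum(layout.values())))
    print(survives_colo_loss(layout))
```

With 7 servers the quorum is 4, so the ensemble keeps a quorum if any single colo is lost, including the one with 3 servers.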
We have talked about adding this functionality to the ZK client (connect
to the server with lowest latency/load/connections/etc... first).
However this is not currently implemented. There's also the issue of how
to decide when to switch to another server (i.e., when local comes back).
For the time being you may have to handle this within your own code (two
possible sessions based on connect string).
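A rough Python sketch of the "two connect strings" idea follows. Everything here is an assumption for illustration: connect stands for whatever function your client library uses to open a session, and the host names are invented. It also does not address the open question above of when to switch back once the local server returns.

```python
def connect_preferring_local(connect, local_hosts, remote_hosts):
    """Try the local connect string first; fall back to the remote one.

    `connect` is a caller-supplied function (hypothetical here) that takes a
    connect string and either returns a session object or raises an exception.
    Returns a (label, session) tuple so the caller knows which one was used.
    """
    try:
        return "local", connect(local_hosts)
    except Exception:
        # A broad catch is used only because the real exception type depends
        # on the client library; narrow this in real code.
        return "remote", connect(remote_hosts)


if __name__ == "__main__":
    def fake_connect(hosts):
        # Stand-in for a real client: pretend the local server is down.
        if hosts.startswith("local"):
            raise ConnectionError("local server down")
        return "session to " + hosts

    print(connect_preferring_local(
        fake_connect, "local1:2181", "remote1:2181,remote2:2181"))
```

The returned label lets the application decide later, by its own policy, whether to close a remote session and retry the local connect string.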
You'll want to increase the timeouts from the defaults, since those are
typically meant for a high-performance interconnect (i.e., within a colo).
You are correct though, much will depend on your environment, and some
tuning will be involved.
Do you have any suggestions for the parameters? So far I left tickTime
at 2 sec and increased initLimit and syncLimit to 30 (i.e., one minute).
Our sites are connected with 1Gbit to the Internet, but of course we
have no influence on what's in between. The data managed by zookeeper is
quite large (snapshots are 700 MByte, but they may increase in the future).
What does your latency look like? Try using something like ping between your
colos over a longish period of time. Then look at the min/max/avg
results that you see. What are you seeing? That, along with a bandwidth
measurement (copying files using scp, say), will help you decide.
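Once you have a bandwidth figure, a back-of-the-envelope check like the Python sketch below may help. It uses the numbers from this thread (tickTime of 2000 ms, initLimit of 30, a 700 MByte snapshot). The 100 Mbit/s throughput is an invented placeholder, not a measurement, and the estimate ignores latency and protocol overhead. It assumes the initial sync of a follower, including the snapshot transfer, has to fit within the initLimit window.

```python
def limit_seconds(tick_time_ms, limit_ticks):
    """Convert a limit expressed in ticks (initLimit, syncLimit) to seconds."""
    return tick_time_ms * limit_ticks / 1000.0


def snapshot_transfer_seconds(snapshot_mbytes, bandwidth_mbit_per_s):
    """Rough time to copy a snapshot at a measured bandwidth.

    Ignores latency, protocol overhead, and competing traffic.
    """
    return snapshot_mbytes * 8 / bandwidth_mbit_per_s


if __name__ == "__main__":
    window = limit_seconds(2000, 30)              # tickTime=2000 ms, initLimit=30
    transfer = snapshot_transfer_seconds(700, 100)  # 100 Mbit/s is a placeholder
    print("initLimit window: %.1f s" % window)
    print("estimated snapshot transfer: %.1f s" % transfer)
    print("headroom: %.1f s" % (window - transfer))
```

At the placeholder rate the transfer takes about 56 seconds against a 60 second window, which leaves very little headroom; substitute your own scp measurement to see where you actually stand.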
We'd definitely be interested in your feedback as part of this process.
Both in terms of docs (feel free to enter JIRAs) and other insights you have as
part of the effort.