Re: Zookeeper WAN Configuration

Patrick Hunt Tue, 28 Jul 2009 09:50:40 -0700

Flavio, please enter a doc jira for this if there are no docs, it should be in forrest, not twiki btw. It would be good if you could review the current quorum docs (any type) and create a jira/patch that addresses any/all shortfall.


Patrick


Flavio Junqueira wrote:

Todd, some more answers. Please check out carefully the information at the bottom of this message.
On Jul 27, 2009, at 4:02 PM, Todd Greenwood wrote:
I'm assuming that you're setting the weight of ZooKeeper servers in
PODs to zero, which means that their votes when ordering updates do
not count.

[Todd] Correct.

If my assumption is correct, then you should see a significant
improvement in read performance. I would say that write performance
wouldn't be very different from clients in PODs opening a direct
connection to DC.
[Todd] So the Leader, knowing that machine(s) have a voting weight of zero, doesn't have to wait for their responses in order to form a quorum vote? Does the leader even send voting requests to the weight zero followers?
In the current implementation, it does. When we have observers implemented, the leader won't do it.
3. ZK Servers within the POD would be resilient to network
connectivity failure between the POD and the DC. Once connectivity
re-established, the ZK Servers in the POD would sync with the ZK
servers in the DC, and, from the perspective of a client within the
POD, everything just worked, and there was no network failure.
We want to have servers switching to read-only mode upon network
partitions, but this is a feature under development. We don't have
plans for implementing any model of eventual consistency that would
allow updates even when not being able to form a quorum, and I
personally believe that it would be a major change, with major
implications not only to the code base, but also to the semantics of
our API.
[Todd] What is the current (3.2) behaviour in the case of a network failure that prevents connectivity between ZK Servers in a pod? Assuming the pod is composed of weight=0 followers... are the clients connected to these zookeeper servers still able to read? Do they get exceptions on write? Do the clients hang if it's a synchronous call?
The clients won't be able to read because we don't have this feature of going read-only upon partitions.
4. A WAN topology of co-located ZK servers in both the DC and (n)
PODs would not significantly degrade the performance of the
ensemble, provided large blobs of traffic were not being sent across
the network.
If the zk servers in the PODs are assigned weight zero, then I don't
see a reason for having lower performance in the scenario you
describe. If weights are greater than zero for zk servers in PODs,
then your performance might be affected, but there are ways of
assigning weights that do not require receiving votes from all co-
locations for progress.
[Todd] Great, we'll proceed with hierarchical configuration w/ ZK Servers in pods having a voting weight of zero. Could you provide a pointer to a configuration that shows this? The docs are a bit lean in this regard...
We should have a twiki page on this. For now, you can find an example in the header of QuorumHierarchical.java.
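
For illustration only, here is a minimal Python sketch of how a hierarchical configuration with zero-weight POD servers might behave. It is not the code in QuorumHierarchical.java. The group.N / weight.N lines follow the hierarchical quorum syntax used in the ZooKeeper configuration file; the server ids and the DC/POD layout (servers 1-3 in the DC, 4-5 and 6-7 in two PODs) are assumptions made up for this example, as are the helper names parse_config and contains_quorum. The quorum rule sketched is: a set of servers forms a quorum if it holds more than half of the weight in more than half of the groups that have non-zero total weight.

```python
from collections import defaultdict

# Hypothetical layout (an assumption for this sketch):
# servers 1-3 in the DC, servers 4-5 in POD A, servers 6-7 in POD B.
CONFIG = """
group.1=1:2:3
group.2=4:5
group.3=6:7
weight.1=1
weight.2=1
weight.3=1
weight.4=0
weight.5=0
weight.6=0
weight.7=0
"""

def parse_config(text):
    """Parse group.N / weight.N lines into two dicts keyed by server id."""
    groups = {}   # server id -> group id
    weights = {}  # server id -> voting weight
    for line in text.split():
        key, value = line.split("=")
        kind, ident = key.split(".")
        if kind == "group":
            for sid in value.split(":"):
                groups[int(sid)] = int(ident)
        elif kind == "weight":
            weights[int(ident)] = int(value)
    return groups, weights

def contains_quorum(acks, groups, weights):
    """Return True if the servers in `acks` form a hierarchical quorum."""
    total = defaultdict(int)  # group id -> total weight of the group
    acked = defaultdict(int)  # group id -> weight that acknowledged
    for sid, gid in groups.items():
        w = weights.get(sid, 1)  # weight defaults to 1 when not listed
        total[gid] += w
        if sid in acks:
            acked[gid] += w
    # Groups whose total weight is zero never count toward the quorum.
    voting_groups = [g for g, w in total.items() if w > 0]
    majority_groups = [g for g in voting_groups if 2 * acked[g] > total[g]]
    return 2 * len(majority_groups) > len(voting_groups)

groups, weights = parse_config(CONFIG)
print(contains_quorum({1, 2}, groups, weights))           # True: two of three DC servers
print(contains_quorum({1, 4, 5, 6, 7}, groups, weights))  # False: POD votes carry no weight
```

With this layout, only the DC group has non-zero weight, so any two DC servers are enough to make progress and acknowledgements from the POD servers never change the outcome.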
Also, I found a couple of bugs recently that may or may not affect your setup, so I suggest that you apply the patches in ZOOKEEPER-481 and ZOOKEEPER-479. We would like to have these patches in for the next release (3.2.1), which should be out in two or three weeks, if there is no further complication.
Another issue that I realized that won't work in your case, but the fix would be relatively easy, is the guarantee that no zero-weight follower will be elected. Currently, we don't check the weight during leader election. I'll open a jira and put up a patch soon.
-Flavio
