I've created a Jira for us to discuss/document this:


I will make 78/79 depend on this

78 is James' patch - may require some rework after 79/80 are finalized
79 is documentation of the leader election protocol that Jacob outlined


Benjamin Reed wrote:
Excellent proposal. The only thing I would add is that there should be
an English description of the recipe in subversion. That way if someone
wanted to do a compatible binding they can do it. If the recipe is on
the wiki it would be hard to keep it in sync, so it is important that it
is in subversion. My preference would be that the doc would be in the
same contrib subdirectory as the source for ease of maintenance.


Patrick Hunt wrote:
James, thanks for the contribution! Tests and everything. :-)

Jacob sent some mail to the list recently (attached) that details a
protocol that he's used successfully (and picked up by some zk users).
I have a todo item to document this protocol on the recipes wiki page,
haven't gotten to it yet. Not sure how/if this matches what you've
done but we should sync up (also see below).

There has been some discussion on client side helper code in the past
however this is the first contribution. We need to make some decisions
and outline what/how we will accept.

1) I think we should have a
"contrib/recipes/{java/{main,test}/org/apache/zookeeper/... ,c/,...}"
hierarchy for contributions that implement recipes, including any
helper code

2) We should first document recipes on the wiki, then implement them
in the code.
The code should fully document the api/implementation, and refer to
wiki page for protocol specifics.

3) What should we do relative to ZK releases? Are recipes included in
a release? Will bugs in recipes hold up a release?

My initial thought is that contrib is available through svn, but not
included in the release. If users want to access/use this code they
will be required to checkout/build themselves. (at least initially)

4) We will not require "parity" between the various client languages.
Currently we support Java/C clients, we will be adding various
scripting languages soon. Contributions will be submitted for various
clients (James' submission is for java), that will be placed into
contrib, if someone else contributes C bindings (etc...) we will add
those to contrib/recipes as well.

5) Implementations should strive to implement similar recipe protocols
(see 2 above, a good reason to document before implement). There may
be multiple, different, protocols each with their own implementation,
but for a particular protocol the implementations should be the same.

We may want to stress 5 even more - if multiple client
implementations (c/java/...) are participating in a single instance of
leader election it will be CRITICAL for them to be inter-operable.

Comments, questions, suggestions?


James Strachan wrote:
So having recently discovered ZooKeeper, I'm really liking it - good
job folks!

I've seen discussions of building high level features from the core ZK
library and had not seen any available on the interweb so figured I'd
have a try creating a simple one. Feel free to ignore it if a ZK ninja
can think of a neater way of doing it - I've basically followed the
protocol defined in the recent ZK presentation...

I've submitted the code as a patch here...

I figured the Java Client might as well come with some helper code to
make doing things like exclusive locks or leader elections easier; we
could always spin them out into a separate library if and when
required etc. Right now it's one fairly simple class :)

Currently it's a simple class where you can register a Runnable to be
invoked when you have the lock; or you can just keep asking if you
have the lock now and again as you see fit etc.

WriteLock locker = new WriteLock(zookeeper, "/foo/bar");
locker.setWhenOwner(new Runnable() {...}); // fire this code when

// lets try own it

// I may or may not have the lock now
if (locker.isOwner()) {....}

// time passes



Re: [Zookeeper-user] Leader election
"Jacob Levy" <[EMAIL PROTECTED]>
Fri, 11 Jul 2008 10:42:33 -0700
"Flavio Junqueira" <[EMAIL PROTECTED]>,


The following protocol will help you fix the observed misbehavior. As
Flavio points out, you cannot rely on the order of nodes in
getChildren, you must use an intrinsic property of each node to
determine who is the leader. The protocol devised by Runping Qi and
described here will do that.

First of all, when you create child nodes of the node that holds the
leadership bids, you must create them with the EPHEMERAL and SEQUENCE
flag. ZooKeeper guarantees to give you an ephemeral node named
uniquely and with a sequence number larger by at least one than any
previously created node in the sequence. You provide a prefix, like
"L_" or your own choice, and ZooKeeper creates nodes named "L_23",
"L_24", etc. The sequence number starts at 0 and increases monotonically.

Once you've placed your leadership bid, you search backwards from the
sequence number of **your** node to see if there are any preceding (in
terms of the sequence number) nodes. When you find one, you place a
watch on it and wait for it to disappear. When you get the watch
notification, you search again, until you do not find a preceding
node, then you know you're the leader. This protocol guarantees that
there is at any time only one node that thinks it is the leader. But
it does not disseminate information about who is the leader. If you
want everyone to know who is the leader, you can have an additional
Znode whose value is the name of the current leader (or some
identifying information on how to contact the leader, etc.). Note that
this cannot be done atomically, so by the time other nodes find out
who the leader is, the leadership may already have passed on to a
different node.
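
As a minimal sketch of the "search backwards" step described above, the
helper below picks which node to watch, given the children of the election
node. The class and method names (`ElectionSketch`, `sequenceOf`,
`predecessorOf`) are hypothetical, and it assumes every child name ends in
the decimal sequence number appended by ZooKeeper and that the prefix itself
does not end in a digit. It is plain Java and does not call the ZooKeeper
client; placing the watch and repeating the search on each notification are
left out.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper; not part of the ZooKeeper client library. */
final class ElectionSketch {

    /** Parses the trailing decimal digits of a node name, e.g. "L_24" -> 24. */
    static long sequenceOf(String nodeName) {
        int i = nodeName.length();
        while (i > 0 && Character.isDigit(nodeName.charAt(i - 1))) {
            i--;
        }
        if (i == nodeName.length()) {
            throw new IllegalArgumentException("no sequence suffix: " + nodeName);
        }
        return Long.parseLong(nodeName.substring(i));
    }

    /**
     * Returns the child with the largest sequence number below our own,
     * i.e. the node to watch. An empty result means no node precedes ours,
     * so we are the leader.
     */
    static Optional<String> predecessorOf(List<String> children, String ownNode) {
        long own = sequenceOf(ownNode);
        String best = null;
        long bestSeq = -1;
        for (String child : children) {
            long seq = sequenceOf(child);
            if (seq < own && seq > bestSeq) {
                best = child;
                bestSeq = seq;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Because the choice depends only on the numbers embedded in the node names,
the order in which the children are listed does not affect the result.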


Might it make sense to provide a standardized implementation of leader
election in the library code in Java?



[mailto:[EMAIL PROTECTED] *On Behalf Of
*Flavio Junqueira
*Sent:* Friday, July 11, 2008 1:02 AM
*Subject:* Re: [Zookeeper-user] Leader election

Hi Avinash, getChildren returns a list in lexicographic order, so if
you are updating the children of the election node concurrently, then
you may get a different first node with different clients. If you are
using the sequence flag to create nodes, then you may consider
stripping the prefix of the node name and using the suffix value to
determine order.
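
To illustrate the suggestion above, here is a small sketch that orders the
children by numeric suffix rather than by name. The names `SuffixOrder`,
`suffixOf`, and `sortBySuffix` are made up for this example, and the prefix
is assumed to be known to every client. With unpadded names such as "L-9"
and "L-10", a plain string sort would put "L-10" first; comparing the parsed
numbers avoids that.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper; not part of the ZooKeeper client library. */
final class SuffixOrder {

    /** Strips the known prefix and parses the remainder as the sequence number. */
    static long suffixOf(String nodeName, String prefix) {
        if (!nodeName.startsWith(prefix)) {
            throw new IllegalArgumentException("unexpected node name: " + nodeName);
        }
        return Long.parseLong(nodeName.substring(prefix.length()));
    }

    /** Returns a copy of the children sorted by ascending sequence number. */
    static List<String> sortBySuffix(List<String> children, String prefix) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong((String name) -> suffixOf(name, prefix)));
        return sorted;
    }
}
```

Every client that applies the same ordering to the same set of children
sees the same first node.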

Hope it helps.


----- Original Message ----
From: Avinash Lakshman <[EMAIL PROTECTED]>
Sent: Friday, July 11, 2008 7:20:06 AM
Subject: [Zookeeper-user] Leader election


I am trying to elect a leader among 50 nodes. There is always one odd
guy who thinks the leader is someone other than the node that the
other nodes see as leader. Could someone please tell me what is wrong with
the following code for leader election:

public void electLeader()
{
    ZooKeeper zk = StorageService.instance().getZooKeeperHandle();
    String path = "/Leader";
    try
    {
        String createPath = path + "/L-";
        LeaderElector.createLock_.lock();
        while( true )
        {
            /* Get all znodes under the Leader znode */
            List<String> values = zk.getChildren(path, false);
            /*
             * Get the first znode and if it is the
             * pathCreated created above then the data
             * in that znode is the leader's identity.
             */
            if ( leader_ == null )
            {
                leader_ = new AtomicReference<EndPoint>(
                    EndPoint.fromBytes( zk.getData(path + "/" + values.get(0), false, null) ) );
            }
            else
            {
                leader_.set( EndPoint.fromBytes(
                    zk.getData(path + "/" + values.get(0), false, null) ) );
                /* Disseminate the state as to who the leader is. */
            }
            logger_.debug("Elected leader is " + leader_ + " @ znode " + ( path + "/" + values.get(0) ) );
            Collections.sort(values);
            /* We need only the last portion of this znode */
            String[] peices = pathCreated_.split("/");
            int index = Collections.binarySearch(values, peices[peices.length - 1]);
            if ( index > 0 )
            {
                String pathToCheck = path + "/" + values.get(index - 1);
                Stat stat = zk.exists(pathToCheck, true);
                if ( stat != null )
                {
                    logger_.debug("Awaiting my turn ...");
                    logger_.debug("Checking to see if leader is around ...");
                }
            }
        }
    }
    catch ( InterruptedException ex ) { ... }
    catch ( KeeperException ex ) { ... }
}


