James, thanks for the contribution! Tests and everything. :-)

Jacob sent some mail to the list recently (attached) that details a protocol that he's used successfully (and picked up by some zk users). I have a todo item to document this protocol on the recipes wiki page, haven't gotten to it yet. Not sure how/if this matches what you've done but we should sync up (also see below).

There has been some discussion on client side helper code in the past however this is the first contribution. We need to make some decisions and outline what/how we will accept.

1) I think we should have a "contrib/recipes/{java/{main,test}/org/apache/zookeeper/... ,c/,...}" hierarchy for contributions that implement recipes, including any helper code

2) We should first document recipes on the wiki, then implement them in the code
The code should fully document the api/implementation, and refer to wiki page for protocol specifics.

3) What should we do relative to ZK releases. Are recipes included in a release? Will bugs in recipes hold up a release?

My initial thought is that contrib is available through svn, but not included in the release. If users want to access/use this code they will be required to checkout/build themselves. (at least initially)

4) We will not require "parody" btw the various client languages. Currently we support Java/C clients, we will be adding various scripting languages soon. Contributions will be submitted for various clients (James' submission is for java), that will be placed into contrib, if someone else contributes C bindings (etc...) we will add those to contrib/recipes as well.

5) Implementations should strive to implement similar recipe protocols (see 2 above, a good reason to document before implement). There may be multiple, different, protocols each with their own implementation, but for a particular protocol the implementations should be the same.

We may want to stress 5 even more - if multiple clients implementations (c/java/...) are participating in a single instance of leader election it will be CRITICAL for them to be inter-operable.

Comments, questions, suggestion?


James Strachan wrote:
So having recently discovered ZooKeeper, I'm really liking it - good job folks!

I've seen discussions of building high level features from the core ZK
library and had not seen any available on the interweb so figured I'd
have a try creating a simple one. Feel free to ignore it if a ZK ninja
can think of a neater way of doing it - I've basically followed the
protocol defined in the recent ZK presentation...

I've submitted the code as a patch here...

I figured the Java Client might as well come with some helper code to
make doing things like exclusive locks or leader elections easier; we
could always spin them out into a separate library if and when
required etc. Right now its one fairly simple class :)

Currently its a simple class where you can register a Runnable to be
invoked when you have the lock; or you can just keep asking if you
have the lock now and again as you see fit etc.

WriteLock locker = new WriteLock(zookeeper, "/foo/bar");
locker.setWhenOwner(new Runnable() {...}); // fire this code when owner...

// lets try own it

// I may or may not have the lock now
if (locker.isOwner()) {....}

// time passes


--- Begin Message ---


The following protocol will help you fix the observed misbehavior. As
Flavio points out, you cannot rely on the order of nodes in getChildren,
you must use an intrinsic property of each node to determine who is the
leader. The protocol devised by Runping Qi and described here will do


First of all, when you create child nodes of the node that holds the
leadership bids, you must create them with the EPHEMERAL and SEQUENCE
flag. ZooKeeper guarantees to give you an ephemeral node named uniquely
and with a sequence number larger by at least one than any previously
created node in the sequence. You provide a prefix, like "L_" or your
own choice, and ZooKeeper creates nodes named "L_23", "L_24", etc. The
sequence number starts at 0 and increases monotonously.


Once you've placed your leadership bid, you search backwards from the
sequence number of *your* node to see if there are any preceding (in
terms of the sequence number) nodes. When you find one, you place a
watch on it and wait for it to disappear. When you get the watch
notification, you search again, until you do not find a preceding node,
then you know you're the leader. This protocol guarantees that there is
at any time only one node that thinks it is the leader. But it does not
disseminate information about who is the leader. If you want everyone to
know who is the leader, you can have an additional Znode whose value is
the name of the current leader (or some identifying information on how
to contact the leader, etc.). Note that this cannot be done atomically,
so by the time other nodes find out who the leader is, the leadership
may already have passed on to a different node.




Might it make sense to provide a standardized implementation of leader
election in the library code in Java?





[mailto:[EMAIL PROTECTED] On Behalf Of
Flavio Junqueira
Sent: Friday, July 11, 2008 1:02 AM
Subject: Re: [Zookeeper-user] Leader election


Hi Avinash, getChildren returns a list in lexicographic order, so if you
are updating the children of the election node concurrently, then you
may get a different first node with different clients. If you are using
the sequence flag to create nodes, then you may consider stripping the
prefix of the node name and using the sufix value to determine order. 

Hope it helps.



----- Original Message ----
From: Avinash Lakshman <[EMAIL PROTECTED]>
Sent: Friday, July 11, 2008 7:20:06 AM
Subject: [Zookeeper-user] Leader election


I am trying to elect leader among 50 nodes. There is always one odd guy
who seems to think that someone else distinct from what some other nodes
see as leader. Could someone please tell me what is wrong with the
following code for leader election:

public void electLeader()
            ZooKeeper zk =
            String path = "/Leader";
                String createPath = path + "/L-";

                while( true )
                    /* Get all znodes under the Leader znode */
                    List<String> values = zk.getChildren(path, false);
                     * Get the first znode and if it is the 
                     * pathCreated created above then the data
                     * in that znode is the leader's identity. 
                    if ( leader_ == null )
                        leader_ = new AtomicReference<EndPoint>(
EndPoint.fromBytes( zk.getData(path + "/" + values.get(0), false, null)
) );
                        leader_.set( EndPoint.fromBytes( zk.getData(path
+ "/" + values .get(0), false, null) ) );
                        /* Disseminate the state as to who the leader
is. */
                    logger_.debug("Elected leader is " + leader_ + " @
znode " + ( path + "/" + values.get(0) ) );                   
                    /* We need only the last portion of this znode */
                    String[] peices = pathCreated_.split("/");
                    int index = Collections.binarySearch(values,
peices[peices.length - 1]);                   
                    if ( index > 0 )
                        String pathToCheck = path + "/" +
values.get(index - 1);
                        Stat stat = zk.exists(pathToCheck, true);
                        if ( stat != null )
                            logger_.debug("Awaiting my turn ...");
                            logger_.debug("Checking to see if leader is
around ...");
            catch ( InterruptedException ex )
            catch ( KeeperException ex )



Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
Zookeeper-user mailing list

--- End Message ---

Reply via email to