On Tue, Aug 19, 2008 at 3:24 AM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
> Anthony, Could you tell me how you're starting up the servers? Everything
> works fine in my setting, so I can't reproduce it. I'm starting up one
> server at a time, and my config is very similar to yours:

I run this command on all three servers at the same time with cssh:

java -cp /home/anthonyu/lib/zookeeper/trunk/zookeeper-3.0.0.jar:/home/anthonyu/lib/log4j-1.2.15.jar:/home/anthonyu/lib/zookeeper/trunk/conf org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg &

>
> clientPort=2181
> quorumPort=1111
> electionPort=1112
> tickTime=2000
> initLimit=5
> syncLimit=5
> dataDir=/tmp/zookeeper
> server.1=xxx1:11111
> server.2=xxx2:11111
> server.3=xxx3:11111

I do not have the quorumPort and electionPort attributes in my config
file. Are there defaults? I will try them next time I need to bring the
cluster down.

>
> In any case, there seems to be a race condition in QuorumCnxManager, which
> I'll investigate.
>
> Thanks,
> -Flavio
>
>
>> -----Original Message-----
>> From: Anthony Urso [mailto:[EMAIL PROTECTED]
>> Sent: Tuesday, August 19, 2008 3:49 AM
>> To: zookeeper-dev@hadoop.apache.org
>> Subject: Fast leader election algorithm throws NPE and hangs
>>
>> I updated trunk to current to get the diff for ZOOKEEPER-122, and I
>> stopped being able to run my dev zookeeper cluster in distributed
>> mode. In order to get it running again, I had to specify the election
>> algorithm to be 0.
>>
>> One of the servers gets this NPE:
>>
>> Exception in thread "Thread-2" java.lang.NullPointerException
>>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:518)
>>
>> The rest just hang while running an election:
>>
>> zoo.log:
>> 2008-08-18 18:31:26,519 - INFO [QuorumPeer:[EMAIL PROTECTED] - LOOKING
>> 2008-08-18 18:31:26,537 - WARN [QuorumPeer:[EMAIL PROTECTED] - Election tally: 0
>>
>> command line:
>> java -cp /home/anthonyu/lib/zookeeper/trunk/zookeeper-3.0.0.jar:/home/anthonyu/lib/log4j-1.2.15.jar:/home/anthonyu/lib/zookeeper/trunk/conf org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg &
>>
>> original zoo.cfg:
>> tickTime=2000
>> dataDir=/home/anthonyu/zookeeper
>> clientPort=2181
>> initLimit=5
>> syncLimit=2
>> server.1=zoo1:2182
>> server.2=zoo2:2182
>> server.3=zoo3:2182
>>
>> I don't know if this is a bug or a misconfiguration.
>>
>> Cheers,
>> Anthony
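
The two configs quoted in this thread differ mainly in whether quorumPort and electionPort are set. A minimal Java sketch for spotting that kind of difference follows. It reads a zoo.cfg-style text with java.util.Properties and reports which of the named keys are absent, plus the server.N entries it found. The class ZooCfgCheck and its helper methods are hypothetical names made up for illustration and are not part of ZooKeeper. The sketch only inspects the file; it says nothing about whether ZooKeeper applies defaults for keys that are missing.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of ZooKeeper: inspects a zoo.cfg-style text.
class ZooCfgCheck {

    // Parse key=value lines into a sorted map using java.util.Properties.
    static Map<String, String> parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> out = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            out.put(name, props.getProperty(name));
        }
        return out;
    }

    // Return the requested keys that do not appear in the config, in argument order.
    static List<String> missingKeys(Map<String, String> cfg, String... keys) {
        List<String> missing = new ArrayList<>();
        for (String key : keys) {
            if (!cfg.containsKey(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    // Collect only the server.N entries (values are kept as written, e.g. host:port).
    static Map<String, String> serverEntries(Map<String, String> cfg) {
        Map<String, String> servers = new TreeMap<>();
        for (Map.Entry<String, String> e : cfg.entrySet()) {
            if (e.getKey().startsWith("server.")) {
                servers.put(e.getKey(), e.getValue());
            }
        }
        return servers;
    }

    public static void main(String[] args) {
        // The "original zoo.cfg" quoted in this thread.
        String original = String.join("\n",
                "tickTime=2000",
                "dataDir=/home/anthonyu/zookeeper",
                "clientPort=2181",
                "initLimit=5",
                "syncLimit=2",
                "server.1=zoo1:2182",
                "server.2=zoo2:2182",
                "server.3=zoo3:2182");

        Map<String, String> cfg = parse(original);
        System.out.println("missing: " + missingKeys(cfg, "quorumPort", "electionPort"));
        System.out.println("servers: " + serverEntries(cfg));
    }
}
```

Running the same check against Flavio's config would show both keys present, which makes the difference between the two setups easy to see before restarting the cluster.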