[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

Flavio Paiva Junqueira (JIRA) Mon, 19 Jul 2010 09:28:15 -0700

    [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889906#action_12889906
 ]


Flavio Paiva Junqueira commented on ZOOKEEPER-822:
--------------------------------------------------

Vishal, I can't reproduce your problem. I tried twice, each time killing the 
leader and having it rejoin 20 times, and I don't see the problem you're 
describing. I wonder if there is anything special about your setup. I can also 
see lots of connection-related exceptions in your logs, and at first glance it 
looks like these are preventing the servers from exchanging notifications, 
hence the delay. 

Two minor comments: your log file for server 2 does not contain "START HERE" 
and each file duplicates every message.



> Leader election taking a long time  to complete
> -----------------------------------------------
>
>                 Key: ZOOKEEPER-822
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.0
>            Reporter: Vishal K
>            Priority: Blocker
>         Attachments: test_zookeeper_1.log, test_zookeeper_2.log
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyway, so logs from that node 
> shouldn't matter.
> Look for "START HERE". The logs after that point are the ones of interest.
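
For reference when reading the timings above: in zoo.cfg, initLimit and 
syncLimit are expressed in ticks, so their wall-clock values depend on 
tickTime. The following is a minimal sketch, not part of ZooKeeper, that 
parses the quoted configuration and converts both limits to milliseconds. The 
helper names parse_cfg and timeouts_ms are hypothetical, and the parser only 
handles the escaped colons seen in this file, not the full Java properties 
syntax.

```python
# Hypothetical helpers for illustration; they are not ZooKeeper APIs.
# The configuration text is copied from the issue description.
ZOO_CFG = r"""
#Mon Jul 19 12:15:10 UTC 2010
server.1=192.168.4.12\:2888\:3888
server.0=192.168.4.11\:2888\:3888
clientPort=2181
dataDir=/var/zookeeper
syncLimit=2
server.2=192.168.4.13\:2888\:3888
initLimit=5
tickTime=2000
"""


def parse_cfg(text):
    """Parse key=value lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        # The file was written with escaped colons ("\:"); undo that here.
        settings[key.strip()] = value.strip().replace("\\:", ":")
    return settings


def timeouts_ms(settings):
    """Convert initLimit and syncLimit from ticks to milliseconds."""
    tick = int(settings["tickTime"])
    return {
        "initLimit_ms": int(settings["initLimit"]) * tick,
        "syncLimit_ms": int(settings["syncLimit"]) * tick,
    }


cfg = parse_cfg(ZOO_CFG)
print(cfg["server.0"])   # 192.168.4.11:2888:3888
print(timeouts_ms(cfg))  # {'initLimit_ms': 10000, 'syncLimit_ms': 4000}
```

With the values quoted above, initLimit works out to 10 seconds and syncLimit 
to 4 seconds.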

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
