how big is your data? you may be running into the problem where it takes too long to do the state transfer and times out. check the initLimit and the size of your data.
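
As a rough sanity check on the timeout: initLimit is expressed in ticks, so the window a follower has to connect and sync with the leader is initLimit * tickTime milliseconds. The sketch below is only back-of-the-envelope arithmetic; the function names, snapshot size, and throughput figure are made-up assumptions, not measurements from this cluster.

```python
import math

def init_window_seconds(tick_time_ms, init_limit):
    """Time a follower has to connect and sync: initLimit ticks of tickTime ms."""
    return tick_time_ms * init_limit / 1000.0

def min_init_limit(snapshot_bytes, bytes_per_second, tick_time_ms):
    """Smallest initLimit whose window covers a transfer of snapshot_bytes.

    bytes_per_second is an assumed sustained throughput (hypothetical);
    real transfers also pay serialization and disk costs.
    """
    transfer_ms = snapshot_bytes / bytes_per_second * 1000
    return math.ceil(transfer_ms / tick_time_ms)

# tickTime=2000 and initLimit=10 give a 20 second window.
print(init_window_seconds(2000, 10))                       # 20.0

# Hypothetical 2 GiB snapshot at an assumed 50 MiB/s needs at least 21 ticks.
print(min_init_limit(2 * 1024**3, 50 * 1024**2, 2000))     # 21
```

If the estimate lands near or above the configured initLimit, either raise initLimit in the server config or reduce the amount of data held in the tree.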

ben

On 10/10/2010 08:57 AM, Avinash Lakshman wrote:
Thanks Ben. I am not mixing processes of different clusters. I just double
checked that. I have ZK deployed in a 5 node cluster and I have 20
observers. I just started the 5 node cluster w/o starting the observers. I
still see the same issue. Now my cluster won't start up. So what is the correct
workaround to get this going? How can I find out who the leader is and who
the followers are to get more insight?

Thanks
A
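
On finding out who is leader and who is follower: each server answers the four-letter command "stat" on its client port, and the reply contains a "Mode:" line. The sketch below is one way to poll it; the helper names and the host names in the usage comment are hypothetical, and newer ZooKeeper releases may require the command to be whitelisted before it answers.

```python
import socket

def four_letter_word(host, port, word=b"stat", timeout=5.0):
    """Send a four-letter command to a ZooKeeper client port and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # server closes the socket after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

def parse_mode(stat_output):
    """Return the value of the 'Mode:' line (e.g. leader, follower), or None."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None  # e.g. the server is up but not currently serving requests

# Usage (host names are placeholders, not from this thread):
#   for host in ["zk1.example.com", "zk2.example.com"]:
#       print(host, parse_mode(four_letter_word(host, 2181)))
```

A server that has not joined a quorum returns no "Mode:" line, so a None result is itself a useful signal when the cluster will not start.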

On Sun, Oct 10, 2010 at 8:33 AM, Benjamin Reed<br...@yahoo-inc.com>  wrote:

this usually happens when a follower closes its connection to the leader.
it is usually caused by the follower shutting down or failing. you may get
further insight by looking at the follower logs. you should really run with
timestamps on so that you can correlate the logs of the leader and follower.

one thing that is strange is the wide divergence between zxid of follower
and leader. are you mixing processes of different clusters?
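
To make the divergence concrete: a zxid is a 64-bit value whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch. A minimal sketch (the helper name split_zxid is made up for illustration) that splits the values quoted in the log below:

```python
def split_zxid(zxid):
    """Split a 64-bit zxid into (epoch, counter)."""
    return zxid >> 32, zxid & 0xFFFFFFFF

# Values copied from the quoted log output.
for label, zxid in [
    ("leader", 0x1e00000000),
    ("peer", 0xe00000000),
    ("peer", 0x1800000000),
    ("peer", 0x5002dc766),
    ("peer", 0x1c00000000),
    ("highestZxid (ERROR line)", 124554051584),
    ("next log (ERROR line)", 21477836646),
]:
    epoch, counter = split_zxid(zxid)
    print(f"{label}: epoch={epoch} counter={counter}")
```

Read this way, the leader is at epoch 30 while the peers report epochs 14, 24, 5 and 28, and the decimal value 21477836646 in the ERROR line is the same zxid as 0x5002dc766 (epoch 5).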

ben

________________________________________
From: Avinash Lakshman [avinash.laksh...@gmail.com]
Sent: Sunday, October 10, 2010 8:18 AM
To: zookeeper-user
Subject: What does this mean?

I see this exception and the servers not doing anything.

java.io.IOException: Channel eof
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:630)
ERROR - 124554051584(higestZxid)>  21477836646(next log) for type -11
WARN - Sending snapshot last zxid of peer is 0xe00000000  zxid of leader is 0x1e00000000
WARN - Sending snapshot last zxid of peer is 0x1800000000  zxid of leader is 0x1e00000000
WARN - Sending snapshot last zxid of peer is 0x5002dc766  zxid of leader is 0x1e00000000
WARN - Sending snapshot last zxid of peer is 0x1c00000000  zxid of leader is 0x1e00000000
ERROR - Unexpected exception causing shutdown while sock still open
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:78)
        at java.io.DataOutputStream.writeInt(DataOutputStream.java:180)
        at org.apache.jute.BinaryOutputArchive.writeInt(BinaryOutputArchive.java:55)
        at org.apache.zookeeper.data.StatPersisted.serialize(StatPersisted.java:116)
        at org.apache.zookeeper.server.DataNode.serialize(DataNode.java:167)
        at org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
        at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:967)
        at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:982)
        at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:982)
        at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:982)
        at org.apache.zookeeper.server.DataTree.serialize(DataTree.java:1031)
        at org.apache.zookeeper.server.util.SerializeUtils.serializeSnapshot(SerializeUtils.java:104)
        at org.apache.zookeeper.server.ZKDatabase.serializeSnapshot(ZKDatabase.java:426)
        at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:331)
WARN - ******* GOODBYE /10.138.34.212:33272 ********

Avinash

