Hi Austin,

Did you kill the leader process? It looks like you didn't kill the server, since it's still responding to ruok. Is that true?
mahadev

On 9/2/08 9:56 AM, "Austin Shoemaker" <[EMAIL PROTECTED]> wrote:

> Hi,
>
> We have run into a situation where killing the leader results in followers
> perpetually trying to reelect that leader.
>
> We have 11 zookeeper (2.2.1 from SF.net) servers and 256 clients connecting
> at random. We kill the leader and observe the impact, monitoring a script
> that repeatedly prints the responses to "ruok" and "stat". All servers
> except the killed leader respond with "imok" and "ZooKeeperServer not
> running", respectively.
>
> About half of the time, each remaining server gets into a loop of failing to
> connect to the killed leader and then reelecting the killed leader.
>
> Here is an example log, which is representative of similar logs on the other
> servers. We additionally logged connectivity during leader election. If
> anyone would like complete logs, let me know.
>
> Thanks,
>
> Austin Shoemaker
>
> WARN - [QuorumPeer:[EMAIL PROTECTED] - FOLLOWING
> *WARN - [QuorumPeer:[EMAIL PROTECTED] - Following /10.50.65.22:2889*
> ERROR - [QuorumPeer:[EMAIL PROTECTED] - FIXMSG
> java.net.ConnectException: Connection refused
> *
> .... cont'd ....*
>
> ERROR - [QuorumPeer:[EMAIL PROTECTED] - FIXMSG
> java.lang.Exception: shutdown Follower
> at com.yahoo.zookeeper.server.quorum.Follower.shutdown(Follower.java:364)
> at com.yahoo.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:403)
> WARN - [QuorumPeer:[EMAIL PROTECTED] - LOOKING
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.22:2888
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.22:2888
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.21:2888
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.21:2888
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.12:2888
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.12:2888
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.11:2888
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.11:2888
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.12:2890
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.12:2890
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.11:2890
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.11:2890
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.22:2889
> *WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Exception occurred when sending / receiving packet to / from /10.50.65.22:2889
> java.net.SocketTimeoutException: Receive timed out
> *WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.21:2890
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.21:2890
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.21:2889
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.21:2889
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.12:2889
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.12:2889
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Sending election packet to /10.50.65.11:2889
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Received response from /10.50.65.11:2889
> WARN - [QuorumPeer:[EMAIL PROTECTED] - Election tally:
> WARN - [QuorumPeer:[EMAIL PROTECTED] - 8 -> 1
> WARN - [QuorumPeer:[EMAIL PROTECTED] - 4 -> 1
> WARN - [QuorumPeer:[EMAIL PROTECTED] - 7 -> 8
> WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Election complete, result.winner = 7
> *WARN - [QuorumPeer:[EMAIL PROTECTED] - ----> Election complete, address = /10.50.65.22:2889
> WARN - [QuorumPeer:[EMAIL PROTECTED] - FOLLOWING
> WARN - [QuorumPeer:[EMAIL PROTECTED] - Following /10.50.65.22:2889
> ERROR - [QuorumPeer:[EMAIL PROTECTED] - FIXMSG
> java.net.ConnectException: Connection refused
> * at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
> at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
> at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
> at java.net.Socket.connect(Socket.java:519)
> at com.yahoo.zookeeper.server.quorum.Follower.followLeader(Follower.java:133)
> at com.yahoo.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:399)
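
For anyone trying to reproduce this, here is a rough sketch of the kind of "ruok" / "stat" polling described above. It is not Austin's script. The helper names (four_letter_word, classify, poll) are made up for illustration, and the client port you pass in is an assumption, since the report only lists the election and quorum ports (2888-2890). It relies only on the server answering a four-letter command over a plain TCP connection and then closing it.

```python
import socket

# Reply to "stat" quoted in the report for servers that are up but not serving.
NOT_RUNNING = "ZooKeeperServer not running"


def four_letter_word(host, port, word, timeout=2.0):
    """Send a four-letter command (e.g. "ruok", "stat") and return the raw reply.

    Returns None if the server cannot be reached or does not answer in time.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(word.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # the server closes the connection after replying
                    break
                chunks.append(data)
            return b"".join(chunks).decode("ascii", errors="replace")
    except OSError:  # covers connection refused and socket timeouts
        return None


def classify(ruok_reply, stat_reply):
    """Map a pair of replies to a coarse state (labels are illustrative)."""
    if ruok_reply is None or ruok_reply.strip() != "imok":
        return "down"         # process killed or unreachable
    if stat_reply is None or NOT_RUNNING in stat_reply:
        return "not serving"  # answers ruok, but reports it is not serving
    return "serving"


def poll(servers, timeout=2.0):
    """Return {(host, port): state} for each (host, client_port) pair."""
    return {
        (host, port): classify(
            four_letter_word(host, port, "ruok", timeout),
            four_letter_word(host, port, "stat", timeout),
        )
        for host, port in servers
    }
```

Calling poll() in a loop with the client address of each of the 11 servers should separate the two cases asked about above: a killed process shows up as "down", while a server that is alive but stuck in election shows up as "not serving".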
