Everything is here: http://people.apache.org/~jdcryans/zk_election_bug.tar.gz
The server we are trying to start is sv4borg222 (myid is 2) and we
started it around 10:03:21.
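
For anyone trying to reproduce the symptoms quoted below (ruok answers imok while the election port refuses connections), here is a minimal sketch of how the two checks could be scripted from one machine. It is an illustration, not what we actually ran: the helper names (four_letter_word, port_open, check_peers) are made up for this example, 2181 is assumed to be the client port (the ZooKeeper default, ours may differ), and 3888 is the election port from the log excerpt.

```python
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a ZooKeeper four-letter command (e.g. 'ruok', 'stat')
    to the client port and return whatever the server replies."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # includes "Connection refused" and timeouts
        return False


def check_peers(peers, client_port=2181, election_port=3888):
    """For each peer hostname, report the ruok reply (or the error)
    and whether the election port accepts connections.
    The port defaults are assumptions; adjust them to match zoo.cfg."""
    report = {}
    for host in peers:
        try:
            ruok = four_letter_word(host, client_port, "ruok")
        except OSError as exc:
            ruok = "error: %s" % exc
        report[host] = {
            "ruok": ruok,
            "election_port_open": port_open(host, election_port),
        }
    return report
```

Calling check_peers(["somehost1", "somehost2"]) with the real hostnames from the server.N lines in zoo.cfg would show, per peer, whether the process answers on the client port and whether anything is listening on the election port. It does not replace the netstat output Patrick asked for, it only narrows down which side is refusing.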
On Mon, Jan 25, 2010 at 10:49 AM, Patrick Hunt <ph...@apache.org> wrote:
> 1) Capture the logs from all 5 servers
> 2) give the config for the "down" server, also indicate its server id
> 3) if possible it would be interesting to see the netstat information from 2
> of the servers - the one that's down and one or more of the others.
> Jean-Daniel Cryans wrote:
>> I believe we've just hit the same problem with zk-3.2.1
>> For some reason a machine crashed and it was part of our quorum of 5
>> servers. When we try to restart it, it does this (I replaced
>> hostname and IP):
>> 2010-01-25 10:25:06,469 WARN
>> org.apache.zookeeper.server.quorum.QuorumCnxManager: Cannot open
>> channel to 1 at election address somehost1/someip1:3888
>> java.net.ConnectException: Connection refused
>> at sun.nio.ch.Net.connect(Native Method)
>> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507)
>> at java.nio.channels.SocketChannel.open(SocketChannel.java:146)
>> It has been like that for almost 20 minutes now, trying every other
>> server in the quorum on different channels. ruok says imok but all
>> other commands say that ZK server isn't running. I don't believe that
>> 3.2.2 will help unless ZK-547 does more than it seems to.
>> Anything else I should look at?
>> On Wed, Jan 13, 2010 at 11:19 AM, Nick Bailey <ni...@mailtrust.com> wrote:
>>> So the solution for us was to just nuke zookeeper and restart everywhere.
>>> We will also be upgrading soon.
>>> To answer your question, yes I believe all the servers were running
>>> except for the fact that they were experiencing high CPU usage. As we
>>> began to see some CPU alerts I started restarting some of the servers.
>>> It was then that we noticed that they were not actually running according
>>> I still have the log from one server with a debug level and the rest with
>>> warn level. If you would like to see any of these and analyze them just
>>> let me know.
>>> Thanks for the help,
>>> Nick Bailey