JD, there's something _very_ unusual in your setup. Are you running "official" released ZooKeeper code or something else?

Either there is a misconfiguration on the other servers (the configs for the other servers are exactly the same as 222, right?), or perhaps some patches to the ZK codebase went awry?


See the attached file "zk_ports.txt". This is a summary of the netstat -a that you sent. Notice in particular that UDP sockets are open for port 2888! This should not happen in the default ZK configuration case.
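To repeat this check on other hosts, here is a minimal Python sketch that scans netstat output for UDP sockets bound to a given port. It assumes the usual Linux `netstat -an` column order (protocol first, local address fourth); the function name `udp_listeners` and the sample text are illustrative only, not part of ZooKeeper.

```python
def udp_listeners(netstat_text, port=2888):
    """Return local addresses of UDP sockets bound to `port` in netstat output."""
    hits = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address ...
        if len(fields) >= 4 and fields[0].startswith("udp"):
            local = fields[3]
            # The port is whatever follows the last colon (works for ":::2888" too).
            if local.rsplit(":", 1)[-1] == str(port):
                hits.append(local)
    return hits


# Sample lines in the style of the netstat summary (illustrative only).
sample = """\
tcp6       0      0 :::2888                 :::*                    LISTEN
udp6       0      0 :::2888                 :::*
tcp6       0      0 :::3888                 :::*                    LISTEN
"""
print(udp_listeners(sample))
```

Any non-empty result for port 2888 would be the same symptom described above.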

By default we only use TCP connections between servers (quorum & election). There is an "electionAlg" option that allows users to turn off the TCP-based fast leader election and use a UDP-based algorithm instead, but I don't see it in the config you provided for 222. (As I said, I'm assuming you are not setting this option on the other servers either, correct?)
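A quick way to confirm this across all servers is to read each config and report the effective value. The sketch below is a rough Python helper, not ZooKeeper code: `election_alg` is a hypothetical name, the hostnames in the sample are placeholders, and it assumes that an unset option means the TCP-based fast leader election (value 3 in the released 3.x documentation, as far as I know).

```python
def election_alg(cfg_text, default=3):
    """Return the electionAlg value from zoo.cfg-style text, or `default` if unset.

    Assumption: 3 denotes TCP-based fast leader election and is the default.
    """
    value = default
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, val = line.split("=", 1)
        if key.strip() == "electionAlg":
            value = int(val.strip())  # the last occurrence wins
    return value


# Hypothetical config; zk1/zk2 are placeholder hostnames.
sample_cfg = """\
tickTime=2000
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
"""
print(election_alg(sample_cfg))
```

If any server reports a value other than the default, that server is a candidate for the UDP sockets seen in the netstat output.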


Mahadev and I do remember that there was a bug in the 3.2 branch, before 3.2 was ever released, that caused us to use non-FLE (so UDP-based) election by default. However, we fixed that before 3.2.0 shipped (it was a bug in our config processing code), and it was never exposed in an official release. Perhaps you have picked up some code from before that fix?

Patrick

Jean-Daniel Cryans wrote:
According to the log for 222 it can't open a connection to the election port
(3888) for any of the other servers. This seems very unusual. Can you verify
that there's connectivity on that port between 222 and all the other servers?

jdcry...@sv4borg222:~$ telnet sv4borg224 3888
Trying 10.10.20.224...
telnet: Unable to connect to remote host: Connection refused
jdcry...@sv4borg222:~$ telnet sv4borg224 2888
Trying 10.10.20.224...
Connected to sv4borg224.
Escape character is '^]'.
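The same telnet check can be scripted so it covers both ports on every peer in one pass. This is a minimal Python sketch using only the standard library; `port_open` and `check_peers` are hypothetical helper names, and the 2888/3888 defaults simply mirror the ports tested above.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


def check_peers(hosts, ports=(2888, 3888)):
    """Map each (host, port) pair to whether a TCP connection could be opened."""
    return {(h, p): port_open(h, p) for h in hosts for p in ports}
```

For example, `check_peers(["sv4borg224"])` run from 222 would be expected to report 2888 as reachable and 3888 as refused, matching the telnet session above.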

Also, can you re-run the netstat with -a option? We can see the listen
sockets that way (omitted by netstat by default). It would be great if you
could send the netstat for all 5 servers.

I updated the tar.gz with the 5 netstat -anp

Thx!

J-D

Thanks,

Patrick

Jean-Daniel Cryans wrote:
Everything is here
http://people.apache.org/~jdcryans/zk_election_bug.tar.gz

The server we are trying to start is sv4borg222 (myid is 2) and we
started it around 10:03:21

Thx!

J-D

tcp6       0      0 10.10.20.221:34865      10.10.20.224:2888       ESTABLISHED 14682/java
udp6       0      0 :::2888                 :::*                                14682/java


tcp6       0      0 :::3888                 :::*                    LISTEN      4092/java
unix  2      [ ]         STREAM     CONNECTED     721588877 7642/java


tcp6       0      0 10.10.20.223:42518      10.10.20.224:2888       ESTABLISHED 2704/java
udp6       0      0 :::2888                 :::*                                2704/java


tcp6       0      0 :::2888                 :::*                    LISTEN      31052/java
tcp6       0      0 10.10.20.224:2888       10.10.20.223:42518      ESTABLISHED 31052/java
tcp6       0      0 10.10.20.224:2888       10.10.20.225:51459      ESTABLISHED 31052/java
tcp6       0      0 10.10.20.224:2888       10.10.20.221:34865      ESTABLISHED 31052/java
udp6       0      0 :::2888                 :::*                                31052/java


tcp6       0      0 10.10.20.225:51459      10.10.20.224:2888       ESTABLISHED 19545/java
udp6       0      0 :::2888                 :::*                                19545/java
