Oh my god! You are right, we are running an old dev version of 3.2.0:

zookeeper-r785019-hbase-1329.jar

This is what we shipped HBase trunk with last summer... This quorum
has an uptime of more than 6 months! Well, I guess that explains it. I
thought we had restarted it since then during our HBase upgrades, but it
seems not, so I'm very sorry about this false alarm.

So... all I can say is thank you guys for such reliable software!
We'll be upgrading to 3.2.2 really soon.

J-D

On Mon, Jan 25, 2010 at 1:44 PM, Patrick Hunt <ph...@apache.org> wrote:
> JD, there's something _very_ unusual in your setup. Are you running
> "official" released ZooKeeper code or something else?
>
> Either there is a misconfiguration on the other servers (the configs for the
> other servers are exactly the same as 222's, right?), or perhaps some patches
> to the ZK codebase that went awry?
>
> See the attached file "zk_ports.txt". This is a summary of the netstat -a
> that you sent. Notice in particular that UDP sockets are open for port 2888!
> This should not happen in the default ZK configuration case.
>
> By default we only use TCP connections between servers (quorum & election).
> There is an "electionAlg" option that allows users to turn off the TCP-based
> fast leader election and go with a UDP-based one, but I don't see that in the
> config you provided for 222 (as I said, assuming you are not setting this
> option on the other servers either, correct?).
>
>
> Mahadev and I do remember that there was a bug in the 3.2 branch prior to
> 3.2 ever being released that caused us to use non-FLE (so UDP based)
> election by default, however we fixed that before 3.2.0 ever shipped (it was
> a bug in our config processing code) and it was never exposed in an official
> release. Perhaps you have picked up some code prior to that?
>
> Patrick
>
> Jean-Daniel Cryans wrote:
>>>
>>> According to the log for 222 it can't open a connection to the election
>>> port (3888) for any of the other servers. This seems very unusual. Can you
>>> verify that there's connectivity on that port btw 222 and all the other
>>> servers?
>>
>> jdcry...@sv4borg222:~$ telnet sv4borg224 3888
>> Trying 10.10.20.224...
>> telnet: Unable to connect to remote host: Connection refused
>> jdcry...@sv4borg222:~$ telnet sv4borg224 2888
>> Trying 10.10.20.224...
>> Connected to sv4borg224.
>> Escape character is '^]'.
>>
>>> Also, can you re-run the netstat with the -a option? We can see the listen
>>> sockets that way (omitted by netstat by default). It would be great if
>>> you could send the netstat for all 5 servers.
>>
>> I updated the tar.gz with the 5 netstat -anp
>>
>> Thx!
>>
>> J-D
>>
>>> Thanks,
>>>
>>> Patrick
>>>
>>> Jean-Daniel Cryans wrote:
>>>>
>>>> Everything is here
>>>> http://people.apache.org/~jdcryans/zk_election_bug.tar.gz
>>>>
>>>> The server we are trying to start is sv4borg222 (myid is 2) and we
>>>> started it around 10:03:21
>>>>
>>>> Thx!
>>>>
>>>> J-D
>>>>
>
> tcp6       0      0 10.10.20.221:34865      10.10.20.224:2888       ESTABLISHED 14682/java
> udp6       0      0 :::2888                 :::*                                14682/java
>
>
> tcp6       0      0 :::3888                 :::*                    LISTEN      4092/java
> unix  2      [ ]         STREAM     CONNECTED     721588877 7642/java
>
>
> tcp6       0      0 10.10.20.223:42518      10.10.20.224:2888       ESTABLISHED 2704/java
> udp6       0      0 :::2888                 :::*                                2704/java
>
>
> tcp6       0      0 :::2888                 :::*                    LISTEN      31052/java
> tcp6       0      0 10.10.20.224:2888       10.10.20.223:42518      ESTABLISHED 31052/java
> tcp6       0      0 10.10.20.224:2888       10.10.20.225:51459      ESTABLISHED 31052/java
> tcp6       0      0 10.10.20.224:2888       10.10.20.221:34865      ESTABLISHED 31052/java
> udp6       0      0 :::2888                 :::*                                31052/java
>
>
> tcp6       0      0 10.10.20.225:51459      10.10.20.224:2888       ESTABLISHED 19545/java
> udp6       0      0 :::2888                 :::*                                19545/java
>
>
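
For reference, the check Patrick describes above (a UDP socket open on the
quorum port, and whether anything is listening on the election port) could be
scripted roughly as follows. This is only a sketch and not part of ZooKeeper:
the function names summarize and diagnose are made up for illustration, and it
assumes Linux "netstat -an" style output with the local address in the fourth
column. The port numbers 2888 and 3888 are the ones used in this thread.

```python
QUORUM_PORT = 2888    # quorum port used in this thread
ELECTION_PORT = 3888  # election port used in this thread


def summarize(netstat_text):
    """Return (udp_ports, tcp_listen_ports) found in netstat -an style output."""
    udp_ports, tcp_listen_ports = set(), set()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # too short to be a socket line
        proto, local = fields[0], fields[3]
        port = local.rsplit(":", 1)[-1]
        if not port.isdigit():
            continue  # e.g. unix-domain socket lines
        if proto.startswith("udp"):
            udp_ports.add(int(port))
        elif proto.startswith("tcp") and "LISTEN" in fields:
            tcp_listen_ports.add(int(port))
    return udp_ports, tcp_listen_ports


def diagnose(netstat_text):
    """List the symptoms discussed in the thread, if present."""
    udp_ports, tcp_listen_ports = summarize(netstat_text)
    findings = []
    if QUORUM_PORT in udp_ports:
        findings.append("UDP socket open on quorum port")
    if ELECTION_PORT not in tcp_listen_ports:
        findings.append("no TCP listener on election port")
    return findings
```

Feeding it the netstat output of one server at a time gives an empty list when
neither symptom is present, which makes it easy to compare all five servers.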

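The telnet test quoted earlier in the thread can be repeated for several hosts
and ports with a few lines of Python. A minimal sketch, assuming plain TCP
reachability is all that needs checking; can_connect and check_peers are
hypothetical helper names, and only the host sv4borg224 and the ports 2888 and
3888 come from the thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


def check_peers(hosts, ports, timeout=3.0):
    """Map each (host, port) pair to whether a TCP connection succeeded."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

Calling check_peers(["sv4borg224"], [2888, 3888]) from sv4borg222 would be
expected to mirror the telnet results above (2888 reachable, 3888 refused),
though that depends on the state of the cluster at the time.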