Hi Qian,
  This is quite weird. Are you sure the version is 3.2.1?
   If yes, please create a jira for this.

  Also, can you extract the server logs for the session


>>         ephemeralOwner: 226627854640480810

And post it on a jira? Ephemeral Owner is the session id. You can convert
the above number to hex and look through the logs to see what happened to
this session and post the logs on the jira. Looks like the session close for
the session (226627854640480810) wasn't successful (a bug mostly). So we
need to trace back on what happened on a close of this session and why it
did not close.

Grepping all the server logs for session id (0x32524d5440e022a, this is the
hex of the the above decimal number) might give us some insight into this.


Thanks
mahadev

On 12/15/09 7:44 AM, "Benjamin Reed" <br...@yahoo-inc.com> wrote:

> does  se/diserver_tc/diserver_tc0000000067 appear on all three servers?
> 
> ben
> 
> Qian Ye wrote:
>> Hi guys:
>> 
>> I find a very strange scenario today, I'm not sure how it happen, I just
>> found it like this. Maybe you can give me some information about it, my
>> Zookeeper Server is version 3.2.1.
>> 
>> My Zookeeper cluster contains three servers, with ip:
>> 10.81.12.144,10.81.12.145,10.81.12.141. I wrote a client to create ephemeral
>> node under znode: *se/diserver_tc*. The client runs on the server with ip
>> 10.81.13.173. The client can create a ephemeral node on zookeeper server and
>> write the host ip (10.81.13.173) in to the node as its data. There is only
>> one client process can be running at a time, because the client will listen
>> to a certain port.
>> 
>> It is strange that I found there were two ephemeral node with the ip
>> 10.81.13.173 under znode se/diserver_tc.
>> *se/diserver_tc/diserver_tc0000000067*
>> STAT:
>>         czxid: 124554079820
>>         mzxid: 124554079820
>>         ctime: 1260609598547
>>         mtime: 1260609598547
>>         version: 0
>>         cversion: 0
>>         aversion: 0
>>         ephemeralOwner: 226627854640480810
>>         dataLength: 92
>>         numChildren: 0
>>         pzxid: 124554079820
>> 
>> *se/diserver_tc/diserver_tc0000000095
>> *STAT:
>>         czxid: 128849019107
>>         mzxid: 128849019107
>>         ctime: 1260772197356
>>         mtime: 1260772197356
>>         version: 0
>>         cversion: 0
>>         aversion: 0
>>         ephemeralOwner: 154673159808876591
>>         dataLength: 92
>>         numChildren: 0
>>         pzxid: 128849019107*
>> *
>> There are TWO with different session id! And after I kill the client process
>> on the server 10.81.13.173, the *se/diserver_tc/diserver_tc0000000095 *node
>> disappear, but the *se/diserver_tc/diserver_tc0000000067 *stay the same.
>> That means it is not my coding mistake to create the node twice. I checked
>> several times and I'm sure that there is no another client instance running.
>> And I use the 'stat' command to check the three zookeeper servers, and there
>> is no client from 10.81.13.173,
>> 
>> $echo stat | nc 10.81.12.144 2181
>> Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT
>> Clients:
>>  /10.81.13.173:35676[1](queued=0,recved=0,sent=0) *# it is caused by the nc
>> process*
>> 
>> Latency min/avg/max: 0/3/254
>> Received: 11081
>> Sent: 0
>> Outstanding: 0
>> Zxid: 0x1e000001f5
>> Mode: follower
>> *Node count: 32
>> *
>> $ echo stat | nc 10.81.12.141 2181
>> Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT
>> Clients:
>>  /10.81.12.152:58110[1](queued=0,recved=10374,sent=0)
>>  /10.81.13.173:35677[1](queued=0,recved=0,sent=0) *# it is caused by the nc
>> process*
>> 
>> Latency min/avg/max: 0/0/37
>> Received: 37128
>> Sent: 0
>> Outstanding: 0
>> Zxid: 0x1e000001f5
>> Mode: follower
>> *Node count: 26*
>> 
>> $ echo stat | nc 10.81.12.145 2181
>> Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT
>> Clients:
>>  /10.81.12.153:19130[1](queued=0,recved=10624,sent=0)
>>  /10.81.13.173:35678[1](queued=0,recved=0,sent=0) *# it is caused by the nc
>> process*
>> 
>> Latency min/avg/max: 0/2/213
>> Received: 26700
>> Sent: 0
>> Outstanding: 0
>> Zxid: 0x1e000001f5
>> Mode: leader
>> *Node count: 26*
>> 
>> The three 'stat' commands show different Node count! Just cannot understand
>> how it happened, can anyone give me some explanation about it?
>> 
>> 
>>   
> 

Reply via email to