Hi Qian,
 You can just upload the server logs to this jira and we can figure out if
there is something useful that we can use to find out what happened.

Thanks
mahadev

On 12/15/09 6:05 PM, "Qian Ye" <yeqian....@gmail.com> wrote:

> I have opened a jira for the issue:
> https://issues.apache.org/jira/browse/ZOOKEEPER-628
> and my zookeeper version is 3.2.1, the se/diserver_tc/diserver_
> tc0000000067 only appear on the server 10.81.12.144.
> 
> I will attach the additional information to the jira.
> 
> thx
> 
> On Wed, Dec 16, 2009 at 1:57 AM, Patrick Hunt <ph...@apache.org> wrote:
> 
>> You might also try the "dump" command for all 3 servers (similar to the
>> stat command - it's a 4letterword) and look at it's output -- it includes
>> information on ephemeral nodes.
>> 
>> Patrick
>> 
>> 
>> Qian Ye wrote:
>> 
>>> Hi guys:
>>> 
>>> I find a very strange scenario today, I'm not sure how it happen, I just
>>> found it like this. Maybe you can give me some information about it, my
>>> Zookeeper Server is version 3.2.1.
>>> 
>>> My Zookeeper cluster contains three servers, with ip:
>>> 10.81.12.144,10.81.12.145,10.81.12.141. I wrote a client to create
>>> ephemeral
>>> node under znode: *se/diserver_tc*. The client runs on the server with ip
>>> 10.81.13.173. The client can create a ephemeral node on zookeeper server
>>> and
>>> write the host ip (10.81.13.173) in to the node as its data. There is only
>>> one client process can be running at a time, because the client will
>>> listen
>>> to a certain port.
>>> 
>>> It is strange that I found there were two ephemeral node with the ip
>>> 10.81.13.173 under znode se/diserver_tc.
>>> *se/diserver_tc/diserver_tc0000000067*
>>> STAT:
>>>        czxid: 124554079820
>>>        mzxid: 124554079820
>>>        ctime: 1260609598547
>>>        mtime: 1260609598547
>>>        version: 0
>>>        cversion: 0
>>>        aversion: 0
>>>        ephemeralOwner: 226627854640480810
>>>        dataLength: 92
>>>        numChildren: 0
>>>        pzxid: 124554079820
>>> 
>>> *se/diserver_tc/diserver_tc0000000095
>>> *STAT:
>>>        czxid: 128849019107
>>>        mzxid: 128849019107
>>>        ctime: 1260772197356
>>>        mtime: 1260772197356
>>>        version: 0
>>>        cversion: 0
>>>        aversion: 0
>>>        ephemeralOwner: 154673159808876591
>>>        dataLength: 92
>>>        numChildren: 0
>>>        pzxid: 128849019107*
>>> *
>>> There are TWO with different session id! And after I kill the client
>>> process
>>> on the server 10.81.13.173, the *se/diserver_tc/diserver_tc0000000095
>>> *node
>>> disappear, but the *se/diserver_tc/diserver_tc0000000067 *stay the same.
>>> That means it is not my coding mistake to create the node twice. I checked
>>> several times and I'm sure that there is no another client instance
>>> running.
>>> And I use the 'stat' command to check the three zookeeper servers, and
>>> there
>>> is no client from 10.81.13.173,
>>> 
>>> $echo stat | nc 10.81.12.144 2181
>>> Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT
>>> Clients:
>>>  /10.81.13.173:35676[1](queued=0,recved=0,sent=0) *# it is caused by the
>>> nc
>>> process*
>>> 
>>> Latency min/avg/max: 0/3/254
>>> Received: 11081
>>> Sent: 0
>>> Outstanding: 0
>>> Zxid: 0x1e000001f5
>>> Mode: follower
>>> *Node count: 32
>>> *
>>> $ echo stat | nc 10.81.12.141 2181
>>> Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT
>>> Clients:
>>>  /10.81.12.152:58110[1](queued=0,recved=10374,sent=0)
>>>  /10.81.13.173:35677[1](queued=0,recved=0,sent=0) *# it is caused by the
>>> nc
>>> process*
>>> 
>>> Latency min/avg/max: 0/0/37
>>> Received: 37128
>>> Sent: 0
>>> Outstanding: 0
>>> Zxid: 0x1e000001f5
>>> Mode: follower
>>> *Node count: 26*
>>> 
>>> $ echo stat | nc 10.81.12.145 2181
>>> Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT
>>> Clients:
>>>  /10.81.12.153:19130[1](queued=0,recved=10624,sent=0)
>>>  /10.81.13.173:35678[1](queued=0,recved=0,sent=0) *# it is caused by the
>>> nc
>>> process*
>>> 
>>> Latency min/avg/max: 0/2/213
>>> Received: 26700
>>> Sent: 0
>>> Outstanding: 0
>>> Zxid: 0x1e000001f5
>>> Mode: leader
>>> *Node count: 26*
>>> 
>>> The three 'stat' commands show different Node count! Just cannot
>>> understand
>>> how it happened, can anyone give me some explanation about it?
>>> 
>>> 
>>> 
> 

Reply via email to