I created a jira for the CLI: it should output xids, session ids, etc. in hex.

https://issues.apache.org/jira/browse/ZOOKEEPER-626

Patrick

Mahadev Konar wrote:
Hi Qian,
  This is quite weird. Are you sure the version is 3.2.1?
   If yes, please create a jira for this.

  Also, can you extract the server logs for the session


        ephemeralOwner: 226627854640480810

And post them on a jira? ephemeralOwner is the session id. You can convert
the above number to hex, look through the logs to see what happened to
this session, and post the logs on the jira. It looks like the close of
session 226627854640480810 did not succeed (most likely a bug), so we
need to trace back what happened when this session was closed and why the
close did not complete.

Grepping all the server logs for the session id (0x32524d5440e022a, the
hex form of the above decimal number) might give us some insight into this.
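
The conversion and the grep can be sketched in a few lines of Python. This is
only a rough illustration, not part of ZooKeeper: the function names and the
"*.log*" file pattern are assumptions, so adjust them to wherever your servers
actually write their logs.

```python
import re
from pathlib import Path


def session_id_to_hex(session_id: int) -> str:
    """Return the decimal session id (ephemeralOwner) in hex form."""
    return hex(session_id)


def find_session_lines(log_dir: str, session_id: int):
    """Yield (file name, line) for every log line mentioning the session id in hex.

    The "*.log*" pattern is an assumed naming scheme, not a ZooKeeper default.
    """
    needle = re.compile(re.escape(session_id_to_hex(session_id)), re.IGNORECASE)
    for path in sorted(Path(log_dir).glob("*.log*")):
        with path.open(errors="replace") as handle:
            for line in handle:
                if needle.search(line):
                    yield path.name, line.rstrip("\n")


if __name__ == "__main__":
    # ephemeralOwner of the node that did not go away
    print(session_id_to_hex(226627854640480810))  # prints 0x32524d5440e022a
```

Running find_session_lines() against the log directory of each of the three
servers should give the lines worth attaching to the jira.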


Thanks
mahadev

On 12/15/09 7:44 AM, "Benjamin Reed" <br...@yahoo-inc.com> wrote:

Does se/diserver_tc/diserver_tc0000000067 appear on all three servers?

ben

Qian Ye wrote:
Hi guys:

I ran into a very strange situation today. I'm not sure how it happened; I
just found it in this state. Maybe you can give me some information about it.
My ZooKeeper server is version 3.2.1.

My ZooKeeper cluster contains three servers, with IPs 10.81.12.144,
10.81.12.145 and 10.81.12.141. I wrote a client that creates an ephemeral
node under the znode *se/diserver_tc*. The client runs on the server with IP
10.81.13.173. It creates an ephemeral node on the ZooKeeper server and
writes the host IP (10.81.13.173) into the node as its data. Only one client
process can run at a time, because the client listens on a fixed port.

Strangely, I found two ephemeral nodes with the IP 10.81.13.173 under the
znode se/diserver_tc.
*se/diserver_tc/diserver_tc0000000067*
STAT:
        czxid: 124554079820
        mzxid: 124554079820
        ctime: 1260609598547
        mtime: 1260609598547
        version: 0
        cversion: 0
        aversion: 0
        ephemeralOwner: 226627854640480810
        dataLength: 92
        numChildren: 0
        pzxid: 124554079820

*se/diserver_tc/diserver_tc0000000095*
STAT:
        czxid: 128849019107
        mzxid: 128849019107
        ctime: 1260772197356
        mtime: 1260772197356
        version: 0
        cversion: 0
        aversion: 0
        ephemeralOwner: 154673159808876591
        dataLength: 92
        numChildren: 0
        pzxid: 128849019107

There are TWO nodes with different session ids! After I killed the client
process on the server 10.81.13.173, the *se/diserver_tc/diserver_tc0000000095*
node disappeared, but *se/diserver_tc/diserver_tc0000000067* stayed.
So this is not a coding mistake of mine that creates the node twice. I checked
several times and I'm sure that no other client instance is running. I also
used the 'stat' command to check the three ZooKeeper servers, and there is no
client from 10.81.13.173:

$ echo stat | nc 10.81.12.144 2181
Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT
Clients:
 /10.81.13.173:35676[1](queued=0,recved=0,sent=0) *# it is caused by the nc
process*

Latency min/avg/max: 0/3/254
Received: 11081
Sent: 0
Outstanding: 0
Zxid: 0x1e000001f5
Mode: follower
*Node count: 32*

$ echo stat | nc 10.81.12.141 2181
Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT
Clients:
 /10.81.12.152:58110[1](queued=0,recved=10374,sent=0)
 /10.81.13.173:35677[1](queued=0,recved=0,sent=0) *# it is caused by the nc
process*

Latency min/avg/max: 0/0/37
Received: 37128
Sent: 0
Outstanding: 0
Zxid: 0x1e000001f5
Mode: follower
*Node count: 26*

$ echo stat | nc 10.81.12.145 2181
Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT
Clients:
 /10.81.12.153:19130[1](queued=0,recved=10624,sent=0)
 /10.81.13.173:35678[1](queued=0,recved=0,sent=0) *# it is caused by the nc
process*

Latency min/avg/max: 0/2/213
Received: 26700
Sent: 0
Outstanding: 0
Zxid: 0x1e000001f5
Mode: leader
*Node count: 26*

The three 'stat' outputs show different node counts! I just cannot understand
how this happened. Can anyone explain it?
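
For repeating this comparison, a small Python sketch along the following lines
could collect the same fields from each server. It is an illustration under
assumptions rather than a ZooKeeper tool: fetch_stat, parse_stat and split_zxid
are names made up here, and the parser only expects the "Mode:", "Zxid:" and
"Node count:" lines shown in the outputs above. split_zxid relies on the zxid
being a 64-bit value with the epoch in the high 32 bits and a counter in the
low 32 bits.

```python
import re
import socket


def fetch_stat(host: str, port: int = 2181, timeout: float = 5.0) -> str:
    """Send the 'stat' four-letter command (as 'echo stat | nc' does) and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stat")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_stat(text: str) -> dict:
    """Extract the Mode, Zxid and Node count fields from a 'stat' reply."""
    fields = {}
    for key in ("Mode", "Zxid", "Node count"):
        match = re.search(rf"^{key}:\s*(\S+)", text, re.MULTILINE)
        if match:
            fields[key] = match.group(1)
    return fields


def split_zxid(zxid: int):
    """Split a zxid into (epoch, counter): high 32 bits and low 32 bits."""
    return zxid >> 32, zxid & 0xFFFFFFFF


if __name__ == "__main__":
    # Fields copied from the 10.81.12.144 output above; no network access needed.
    sample = "Zxid: 0x1e000001f5\nMode: follower\nNode count: 32\n"
    info = parse_stat(sample)
    epoch, counter = split_zxid(int(info["Zxid"], 16))
    print(info["Mode"], info["Node count"], epoch, counter)  # follower 32 30 501
```

Calling parse_stat(fetch_stat(host)) for 10.81.12.144, 10.81.12.145 and
10.81.12.141 would put the three node counts side by side. Applied to the czxid
values above, split_zxid(124554079820) gives epoch 29 and
split_zxid(128849019107) gives epoch 30.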


