[ https://issues.apache.org/jira/browse/ZOOKEEPER-628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791537#action_12791537 ]
Patrick Hunt commented on ZOOKEEPER-628: ---------------------------------------- Qian this is great information. Can you tar/gzip the log files from all the servers and attach them? Any/all log files you have would be useful for us to see. > the ephemeral node wouldn't disapper due to session close error > --------------------------------------------------------------- > > Key: ZOOKEEPER-628 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-628 > Project: Zookeeper > Issue Type: Bug > Components: server > Affects Versions: 3.2.1 > Environment: Linux 2.6.9 x86_64 > Reporter: Qian Ye > > I find a very strange scenario today, I'm not sure how it happen, I just > found it like this. Maybe you can give me some information about it, my > Zookeeper Server is version 3.2.1. > My Zookeeper cluster contains three servers, with ip: > 10.81.12.144,10.81.12.145,10.81.12.141. I wrote a client to create ephemeral > node under znode: se/diserver_tc. The client runs on the server with ip > 10.81.13.173. The client can create a ephemeral node on zookeeper server and > write the host ip (10.81.13.173) in to the node as its data. There is only > one client process can be running at a time, because the client will listen > to a certain port. > It is strange that I found there were two ephemeral node with the ip > 10.81.13.173 under znode se/diserver_tc. > se/diserver_tc/diserver_tc0000000067 > STAT: > czxid: 124554079820 > mzxid: 124554079820 > ctime: 1260609598547 > mtime: 1260609598547 > version: 0 > cversion: 0 > aversion: 0 > ephemeralOwner: 226627854640480810 > dataLength: 92 > numChildren: 0 > pzxid: 124554079820 > se/diserver_tc/diserver_tc0000000095 > STAT: > czxid: 128849019107 > mzxid: 128849019107 > ctime: 1260772197356 > mtime: 1260772197356 > version: 0 > cversion: 0 > aversion: 0 > ephemeralOwner: 154673159808876591 > dataLength: 92 > numChildren: 0 > pzxid: 128849019107 > There are TWO with different session id! And after I kill the client process > on the server 10.81.13.173, the se/diserver_tc/diserver_tc0000000095 node > disappear, but the se/diserver_tc/diserver_tc0000000067 stay the same. That > means it is not my coding mistake to create the node twice. I checked several > times and I'm sure that there is no another client instance running. And I > use the 'stat' command to check the three zookeeper servers, and there is no > client from 10.81.13.173, > $echo stat | nc 10.81.12.144 2181 > Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT > Clients: > /10.81.13.173:35676[1](queued=0,recved=0,sent=0) # it is caused by the nc > process > Latency min/avg/max: 0/3/254 > Received: 11081 > Sent: 0 > Outstanding: 0 > Zxid: 0x1e000001f5 > Mode: follower > Node count: 32 > $ echo stat | nc 10.81.12.141 2181 > Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT > Clients: > /10.81.12.152:58110[1](queued=0,recved=10374,sent=0) > /10.81.13.173:35677[1](queued=0,recved=0,sent=0) # it is caused by the nc > process > Latency min/avg/max: 0/0/37 > Received: 37128 > Sent: 0 > Outstanding: 0 > Zxid: 0x1e000001f5 > Mode: follower > Node count: 26 > $ echo stat | nc 10.81.12.145 2181 > Zookeeper version: 3.2.1-808558, built on 08/27/2009 18:48 GMT > Clients: > /10.81.12.153:19130[1](queued=0,recved=10624,sent=0) > /10.81.13.173:35678[1](queued=0,recved=0,sent=0) # it is caused by the nc > process > Latency min/avg/max: 0/2/213 > Received: 26700 > Sent: 0 > Outstanding: 0 > Zxid: 0x1e000001f5 > Mode: leader > Node count: 26 > The three 'stat' commands show different Node count! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.